CD4 PRODUCTION IN PICHIA PASTORIS

Field of the Invention 

This invention relates to the field of
recombinant DNA technology. More particularly, the
invention concerns the development of Pichia pastoris
yeast strains capable of high-level production and
secretion of at least a portion of the human T-cell
receptor molecule CD4 (also referred to as T4 protein)
containing the site of interaction between CD4 and the
human immunodeficiency virus HIV.



Background of the Invention 

The CD4 protein is a glycoprotein of
approximately 60,000 daltons molecular weight that is
expressed on the cell membrane of mature, thymus-derived
(T) lymphocytes, and to a lesser extent on cells
of the monocyte/macrophage lineage. The CD4 molecule
consists of four tandem extracellular domains which
contain significant sequence and structural homology with
the variable (V) and joining (J) regions of
immunoglobulin gene family members, a single
membrane-spanning domain, and a carboxy-terminal
cytoplasmic segment.

The molecule was originally described as a
marker distinguishing the helper/inducer subset of mature
T lymphocytes [Reinherz et al., Cell 19, 821 (1980);
Goldstein et al., Immunol. Review 68, 5 (1982)], and is
known to be involved in the interaction of these cells
with components of the immune system that express class
II major histocompatibility complex (MHC) antigen
molecules [see, for example, Swain, Immunol. Review 74,
129 (1983); Gay et al., Nature 328, 626 (1987); Doyle
et al., Nature 330, 256 (1987)].

The isolation and nucleotide sequence of a cDNA
encoding the CD4 surface glycoprotein was first reported
by Maddon et al. (Columbia University), Cell 42, 93
(1985) and is disclosed in PCT Patent Application
Publication No. WO 88/01304. However, the published
sequence proved to be incorrect at its N-terminus due to
a sequencing error in which the AAG codon (nucleotides
151-153 in the cDNA clone pT4B) was reported as AAC.
Accordingly, the authors originally predicted an
asparagine (asn) at the +3 position. Subsequently,
Richard Axel's group at Columbia University resequenced
the pT4B cDNA and sequenced three cDNAs from different
libraries, as well as genomic clones encoding CD4. They
found that CD4 actually contains lysine (lys) at the
asn assignment, and that the residue designated
originally as +3 is, in fact, the amino-terminal residue
[Littman et al., Cell 55, 541 (1988)].
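
The consequence of the single-nucleotide sequencing error
described above can be illustrated with a short sketch.
The lookup table covers only the two codons named in the
text (AAG and AAC, taken from the standard genetic code);
the function and variable names are illustrative
assumptions and are not part of the original disclosure.

```python
# Standard genetic code entries for the two codons discussed in the text.
CODON_TO_RESIDUE = {"AAG": "Lys", "AAC": "Asn"}


def translate_codon(codon: str) -> str:
    """Return the three-letter residue name for a supported codon."""
    return CODON_TO_RESIDUE[codon.upper()]


def differing_positions(codon_a: str, codon_b: str) -> list[int]:
    """Return the 1-based positions at which two codons differ."""
    pairs = zip(codon_a.upper(), codon_b.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


if __name__ == "__main__":
    reported, corrected = "AAC", "AAG"
    print(reported, "->", translate_codon(reported))
    print(corrected, "->", translate_codon(corrected))
    print("differing positions:", differing_positions(reported, corrected))
```

A change at the third codon position alone is thus enough
to turn the lysine assignment into asparagine.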

Of immediate interest is the finding that the
human CD4 protein binds the human immunodeficiency virus
(HIV), the causative agent of AIDS; and it is believed
that the HIV virus gains entry to the cells through
interaction with the CD4 "receptor". Amongst the
earliest publications concerning the interaction of the
CD4 molecule and the HIV virus are, for example:
Dalgleish et al., Nature 312, 763 (1984); Klatzmann
et al., Nature 312, 767 (1984); McDougal et al.,
J. Immunol. 135, 3151 (1985); and McDougal et al.,
Science 231, 382 (1986).

Recent reports have described transfection of
CD4⁻ cells with CD4-encoding DNA, and the subsequent newly
acquired ability of transformed cells to bind HIV and
become infected. Thus, Maddon et al., Cell 47, 333
(1986) described the recombinant expression of the CD4
(T4) gene in human lymphoid and epithelial cells, and the
ability of previously T4⁻ cells to bind to and become
infected with HIV, after they had been transformed with
recombinant vectors and thereby became T4⁺ cells. The
authors also showed that recombinant human CD4 on mouse
cells did not allow for HIV infection, although it bound
HIV.

Further publications concerning the recombinant
production of soluble CD4 protein, and the ability of the
recombinant product to interact with HIV and thus inhibit
infection are: Smith et al., Science 238, 1704 (1987);
Fisher et al., Nature 331, 76 (1988); Hussey et al.,
Nature 331, 78 (1988); Deen et al., Nature 331, 82
(1988); and Traunecker et al., Nature 331, 84 (1988). A
concise review on the subject of HIV/CD4 interaction has,
for example, been published by Q. J. Sattentau and R. A.
Weiss in Cell 52, 631 (1988).

Smith et al., supra produced soluble, secreted
forms of the CD4 antigen molecule by transfection of
mammalian (CHO) cells with vectors encoding truncated
versions of CD4, in which the transmembrane and
cytoplasmic domains were replaced with a short linker
sequence containing an in-frame stop codon. The authors
worked under the assumption that the deduced amino acid
sequence of CD4 as originally published by Maddon et al.
was correct.

Fisher et al., supra constructed three
truncated CD4 genes that lacked the transmembrane and
cytoplasmic domains, and produced recombinant soluble CD4
protein in dihydrofolate reductase (DHFR)-mutant CHO
cells. The authors discovered the discrepancy in the CD4
amino acid sequence when they sequenced their own cDNA,
but attributed it to a possible allelic polymorphism and
chemically changed the AAG codon to an AAC codon to
obtain a CD4 protein sequence "identical to that
previously reported" (page 331).

To produce a secreted form of CD4, Hussey
et al., supra report the expression of a truncated CD4 gene
in Spodoptera frugiperda (SF) cells, using a baculovirus
(AcNPV) expression system. Milligram quantities of a
hydrophilic extracellular segment of CD4 were generated.

Deen et al., supra described an expression
system in which a recombinant, soluble form of CD4 was
secreted into tissue culture supernatants. Supernatants
from clones were monitored for the expression of soluble
CD4, among others, by Western blot analysis using a
rabbit anti-CD4 polyclonal antibody "developed against a
denatured CD4 protein produced in bacteria" (page 82).

Traunecker et al., supra report the production
and secretion of two soluble chimeric CD4 proteins from
myeloma cells. The secreted proteins retained "at least
some of their original conformation" (page 84).

The specific sequences of CD4 and the HIV virus
that are required for interaction have also been
identified.

The component of HIV that mediates binding of
the virus to CD4 is the surface glycoprotein, gp120
[Lasky et al., Cell 50, 975 (1987)]. To date, antibodies
raised against gp120 have been ineffective in blocking
viral infection either in vitro or in vivo. The inability
to block is probably related to the heterogeneity seen
among gp120 protein sequences from different viral
isolates. Antibodies raised against gp120 from one HIV
isolate will not necessarily recognize gp120 from a
different isolate. Also, the CD4-binding region of gp120
is not accessible to antibody molecules and thus may be
capable of binding CD4 even if antibody does bind gp120.

Studies using monoclonal antibodies to CD4 have
identified the first variable region, V1, comprising the
N-terminal 106 amino acids of mature CD4, as the site of
interaction with gp120 [Berger et al., PNAS 85, 2357
(1988)]. Further analyses of binding, using mutant CD4
proteins, truncated derivatives of CD4, HIV, and purified
gp120 have narrowed this assignment to amino acid
residues 40-48 within V1 [Peterson and Seed, Cell 54, 65
(1988)]. However, other conserved structures within V1
are probably essential to achieve the highest affinity
binding of HIV to CD4.

The affinity of the HIV virus for CD4 makes
this molecule a rational target for development of an
effective AIDS therapy or prevention. It might be
possible to block the entry of HIV into CD4-expressing
T-cells through the use of anti-CD4 antibodies or the
presence of excess soluble CD4 molecules. In the case of
antibodies, the CD4 receptor present on the T-cell
surface may be unable to bind the HIV gp120 "ligand" if
the CD4 is first bound by antibody. Alternatively, if an
excess of soluble CD4 molecules is present in a sample
comprised of CD4-expressing T-cells and HIV, then a large
proportion of virus might bind to the soluble CD4 and thus
be inhibited from binding the cell-associated receptor, so
that viral infection might be lessened or prevented.

Chao et al., J. Biol. Chem. 264, 5812 (1989)
expressed a gene encoding a 113-amino acid, NH2-terminal
fragment of CD4 (rsT4.113) in E. coli under the control
of the E. coli tryptophan operon promoter. An insoluble
product, found in inclusion bodies, is obtained at
5 to 10% of total protein, the purification of which
provides the recombinant peptide at less than 1 20% of the
starting material. The product, unlike the naturally
occurring CD4, contains an unblocked N-terminal methionine
group.

In view of the promising therapeutic results,
there is a great need for a recombinant expression system
that is suitable for the efficient, large-scale
production of a soluble, authentic form of the CD4
protein that, after purification, is suitable for use in
possible AIDS therapies and preventive measures.

The Pichia pastoris yeast expression system,
developed in part by scientists at SIBIA, the assignee of
the present patent application, has proved to be
instrumental in the production of several heterologous
proteins. This system is based on methanol-regulated
promoters and high cell density fermentation. Because
Pichia pastoris is a methylotrophic yeast, it has
metabolic pathways that respond to and regulate methanol
utilization. A key enzyme in the methanol utilization
pathway is alcohol oxidase, a protein encoded by two
genes, AOX1 and AOX2. When Pichia cells are grown in the
presence of methanol, the AOX1 and AOX2 genes are
transcribed and a large amount of alcohol oxidase protein
is produced. The high level of AOX gene expression is
mainly due to the methanol-responsive AOX1 gene promoter
which is activated in the presence of methanol. This
promoter is highly expressed and tightly regulated (see
e.g. the European Patent Application No. 85113737.2,
published June 4, 1986, under No. 183 071). After
identification and isolation of the AOX1 regulatory
elements, a methanol-responsive gene expression system
has been developed in Pichia that places heterologous
genes under the regulation of the AOX1 promoter [Cregg
et al., Bio/Technology 5, 479 (1987)]. Another key
feature of the P. pastoris expression system is the
stable integration of expression cassettes into the P.
pastoris genome, thus significantly decreasing the chance
of vector loss.

Although P. pastoris has been used successfully
for the production of various heterologous proteins,
e.g., hepatitis B surface antigen [Cregg et al., supra],
bovine lysozyme [Digan et al., Developments in Industrial
Microbiology 29, 59 (1988); Digan et al., Bio/Technology
7, 160 (1989)], and Saccharomyces cerevisiae invertase
[Tschopp et al., Bio/Technology 5, 1305 (1987)],
endeavors to produce other heterologous gene products in
Pichia, especially by secretion, have given mixed results
and, in some cases, have been unsuccessful. At our
present level of understanding of the P. pastoris
expression system, it is unpredictable whether a given
gene can be expressed to an appreciable level in this
yeast, whether the expression yields a product that is
stable under ordinary fermentation conditions and
subsequent processing, or whether Pichia will tolerate
the presence of the recombinant gene product in its
cells. Further, it is especially difficult to foresee if
a particular protein will be secreted by P. pastoris, and
if it is, at what efficiency. Even for S. cerevisiae,
which has been considerably more extensively studied than
P. pastoris, the mechanism of protein secretion is not
well defined and understood.

Summary of the Invention

The present invention relates to the production
of a secreted soluble form of CD4 protein, containing the
site of interaction between CD4 and the human
immunodeficiency virus (HIV), in Pichia pastoris
(P. pastoris).

In one aspect, the present invention relates to
a P. pastoris yeast cell containing in its genome at
least one copy of a DNA sequence operably encoding in
P. pastoris at least a portion of human CD4 glycoprotein,
containing the site of interaction between CD4 and HIV,
in operational association with a DNA sequence encoding a
signal sequence which functions to direct secretion of
the encoded glycoprotein in P. pastoris, both under the
regulation of a promoter region of a P. pastoris gene.
The signal sequence of the S. cerevisiae alpha-mating
factor (AMF) gene (AMF pre-pro sequence) is a preferred
signal sequence.

In another aspect, the present invention
concerns a DNA fragment which may be contained within, or
may itself be, a circular plasmid, and which comprises at
least one copy of an expression cassette comprising, in
the direction of transcription, a promoter region of a
first P. pastoris gene, a DNA sequence encoding in
P. pastoris at least a portion of human CD4 glycoprotein
containing the site of interaction between CD4 and the
HIV virus, preceded by a DNA sequence encoding a signal
sequence directing the secretion of said glycoprotein or
a portion thereof in P. pastoris, and a transcription
terminator of a second P. pastoris gene, said first and
second P. pastoris genes being identical or different,
and the segments of said expression cassette being in
operational association. The DNA sequence preceding the
CD4 glycoprotein gene preferably is a DNA sequence
encoding the S. cerevisiae AMF pre-pro sequence followed
by a DNA sequence encoding the AMF processing site lys-arg.

Expression vectors containing such DNA
sequences are also within the scope of the invention.

In a further aspect, the invention relates to a
process for producing and secreting at least a portion of
human CD4 glycoprotein, containing the site of
interaction between CD4 and the virus HIV, into the
culture medium. According to this process, P. pastoris
transformants containing in their genome at least one
copy of a DNA sequence operably encoding in P. pastoris
at least a portion of human CD4 glycoprotein, containing
the site of interaction between CD4 and the virus HIV,
in operational association with a DNA sequence encoding a
signal sequence which functions to direct secretion of
the encoded CD4 or CD4 portion in P. pastoris (the
S. cerevisiae AMF pre-pro sequence being preferred), both
under the regulation of a promoter region of a
P. pastoris gene, are grown under conditions allowing the
expression of the DNA sequences in P. pastoris and
secretion of the CD4 glycoprotein into the culture medium
in a substantially pure form devoid of degradation
products.

Brief Description of Drawings 

Figure 1 shows the nucleotide sequence and
amino acid sequence of the S. cerevisiae alpha-mating
factor (AMF) pre-pro gene segment.

Figure 2 shows the nucleotide sequence and the
deduced amino acid sequence of a 482 bp DNA fragment
encoding amino acids 1-106 of mature CD4 along with its
leader sequence.

Figure 3 illustrates the construction of the
Pichia pastoris expression vector pSCD103 for the
production of human CD4-V1.

Figure 4 shows the nucleotide and amino acid
sequence of the EcoRI insert of pSCD103.

Figure 5 is a restriction map of plasmid pAO815.

Figure 6 shows the cell wet weight over time
for fermentation Runs 568, 570, and 571.

Figure 7 shows the time course of fermentation
Run 593 of the two-copy Mut+ strain, G+SCD103S16.

A. Cell density (grams of wet weight/liter),
plotted against time of fermentation.

B. Recombinant human CD4-V1 production
(mg/liter of cell-free fermentor broth)
for the fermentation presented in Figure
7A is plotted against time. The
expression level was determined by the
quantitative Western blot assay.
Figure 8 is a silver-stained gel, run under
reducing conditions, of V1 standard and Pichia pastoris
fermentor broth. Lanes 1-4 (numbered consecutively from
left to right) contain 100, 200, 300 or 400 ng V1
standard, respectively. Lane 6 is 7.5 µl of G+SCD103S16
fermentor broth; lanes 7 and 8 are 5 µl of G-PAO815
fermentor broth containing 100 ng or 200 ng V1 standard,
respectively. Lane 10 is 3.75 µl of G+SCD103S16 broth
with 3.75 µl of G-PAO815 broth; lanes 11 and 12 are 3.75
µl and 7.5 µl of G-PAO815 broth, alone. Lane 13 is
prestained molecular size standards obtained from
Diversified Biotech, Newton Centre, MA. They are Low
Range Standards #SDS-100P and are: trypsin inhibitor,
20,400; myoglobin, 16,949; CNBr cleavage fragments of
myoglobin, Fragment IV, 14,404; Fragment III, 8,159;
Fragment II, 6,214; Fragment I, 2,152. These standards
are included to denote relative positions of sample bands
on the gel and have not been used to estimate molecular
mass. Lanes 5 and 9 contain broth from a fermentation
run of strain G+SCD103S16 using slightly different
conditions than those reported in the Examples.

Figure 9 shows the results of N-terminal
sequence analysis of rCD4-V1 produced in Pichia pastoris
compared with the published N-terminal sequence for
mature human CD4. Pichia rCD4-V1 was isolated from a gel
similar to the one in Figure 8, and sequenced as
described in Example 4b.

Figure 10 shows the result of a stability test
performed on rCD4-V1 produced in Pichia pastoris. One
hundred microliter samples of G+SCD103S16 broth were
treated under the following conditions, before 10 µl was
separated on SDS-PAGE and subjected to immunoblotting
with polyclonal sCD4 antibody. All broth samples were
frozen immediately upon removal from the fermentor. The
sample in lane 1 was thawed just prior to SDS-PAGE.
Lanes 2-4 contain samples at pH 2.5; lanes 5-7 contain
samples that had been adjusted to pH 5.0 upon thawing.
Samples in lanes 2 and 5 were thawed and immediately
refrozen; lanes 3 and 6 were thawed and incubated at 4°C
for 20 hours; lanes 4 and 7 were thawed and incubated at
30°C for 20 hours. E. coli V1 (100 ng) was included as a
size standard.

Figure 11 shows the nucleotide sequence of the
human CD4 cDNA and the translated sequence of the CD4
protein.

Detailed Description of the Invention 
1. Definitions

An expression system suitable for the
production and secretion of at least a portion of human
CD4 glycoprotein, containing the site of interaction
between CD4 and the human immunodeficiency virus HIV, is
provided. Preferably, this portion is the V1 region of
CD4.

It will be understood that there is some
uncertainty in the literature as to the definition of the
"first variable region" (V1 region) of CD4, often
referred to as "CD4-V1". Studies using recombinant HIV
gp120 demonstrate that the determinants for high affinity
binding lie solely within the first 106 amino acids of
CD4. However, the "recombinant V1" produced in E. coli
contains the first 113 N-terminal amino acids of mature
human CD4. The terms "first variable region", "V1" or "V1
region" and synonymous expressions, alone or in
combination with other terms, are used throughout the
specification and claims to refer to a DNA sequence
including at least the first 106 N-terminal amino acids
of mature human CD4 (as shown in Figure 11). However,
polypeptides deficient in one or more amino acids in the
amino acid sequence reported in the literature, or
polypeptides containing additional amino acids, or
polypeptides in which one or more amino acids in the
amino acid sequence of the V1 region of CD4 are replaced
by other amino acids, are within the scope of the
definition used herein, provided that they exhibit the
functional activity of V1, in particular preserve its
HIV-binding properties. The definition used in connection
with the present invention is intended to embrace all the
allelic variations of V1. Moreover, as noted supra,
derivatives obtained by simple modification of the amino
acid sequence of the naturally occurring product, e.g. by
way of site-directed mutagenesis or other standard
procedures, are included.

The term "at least a portion of human CD4
glycoprotein, containing the site of interaction between
CD4 and the human immunodeficiency virus HIV" and
synonymous expressions, as used herein, refer to the
full-length mature human CD4 glycoprotein molecule or any
portion thereof capable of binding the HIV virus. Just
as in the case of the V1 region of CD4, polypeptides
deficient in one or more amino acids in the correct amino
acid sequence reported in the literature for mature human
naturally occurring CD4 or its respective regions, or
polypeptides containing additional amino acids, or
polypeptides in which one or more amino acids in the
amino acid sequence of CD4 or its respective regions are
replaced by other amino acids, are within the scope of
the definition used herein, provided that they exhibit
the functional activity of CD4, in particular preserve
its HIV-binding properties. The definition used in
connection with the present invention is intended to
embrace all the allelic variations of CD4. Moreover, as
noted supra, derivatives obtained by simple modification
of the amino acid sequence of the naturally occurring
protein, e.g. by way of site-directed mutagenesis or
other standard procedures, are included.

The term "DNA sequence operably encoding
in Pichia pastoris at least a portion of human CD4
glycoprotein, containing the site of interaction between
CD4 and the human immunodeficiency virus HIV" and
grammatical variations thereof, as used herein, refers to
DNA sequences encoding in Pichia pastoris "at least a
portion of human CD4 glycoprotein, containing the site of
interaction between CD4 and the human immunodeficiency
virus HIV" such as the "first variable (V1) region", as
hereinabove defined. This sequence contains, but is not
restricted to, the DNA sequence encoding residues 16
through 84 of the mature CD4 protein, which are contained
within the first disulfide-bonded, covalently closed
peptide loop of CD4. Such sequences may be obtained by
chemical synthesis or by transcription of a messenger RNA
(mRNA) corresponding to CD4 or a portion thereof to a
complementary DNA (cDNA) and converting the latter into a
double stranded cDNA. Additionally, the CD4 sequences
may be obtained through the use of polymerase chain
reaction (PCR) on genomic DNA encoding at least the V1
region. The mRNA can be isolated, for example, from T4⁺
transformed fibroblasts as described by Maddon et al.,
supra (1985). Chemical synthesis of a gene for human CD4
or a portion thereof is, for example, disclosed by
Jameson et al., Science 240, 1335 (1988); Lifson et al.,
Science 241, 712 (1988). The requisite DNA sequence can
also be removed, for example, by restriction enzyme
digest of known vectors harboring the desired gene.
Examples of such vectors and the means for their
preparation can be taken from the following publications:
Smith et al., supra; Fisher et al., supra; Hussey et al.,
supra; Deen et al., supra; Traunecker et al., supra, etc.
According to Example 1 of the present application, a 482
bp DNA fragment encoding the V1 portion of human CD4 was
excised from a 2.2 kb linear BglII-NheI DNA fragment by
digestion with EcoRI and XbaI. However, the CD4-V1
encoding DNA fragment can be removed from other known DNA
fragments as well. For example, a 682 bp EcoRI-NheI
fragment from pT4B is disclosed in Maddon et al., Cell
42, 93 (1985). The V1-encoding sequence can be readily
obtained from this fragment by digestion with EcoRI and
XbaI.
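
The idea of excising a fragment bounded by an EcoRI site
and an XbaI site can be sketched in a few lines. This is
a minimal illustration only: the recognition sequences
(GAATTC for EcoRI, TCTAGA for XbaI) are the well-known
ones, but the toy input sequence and the function names
are hypothetical, and the sketch ignores the exact cut
positions and the overhangs the enzymes leave.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
XBAI_SITE = "TCTAGA"   # XbaI recognition sequence


def find_sites(sequence: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of a site."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def fragment_between(sequence: str, left_site: str, right_site: str):
    """Return the span from the first left site through the first
    right site downstream of it, or None if either site is absent."""
    sequence = sequence.upper()
    left = sequence.find(left_site)
    if left == -1:
        return None
    right = sequence.find(right_site, left + len(left_site))
    if right == -1:
        return None
    return sequence[left:right + len(right_site)]


if __name__ == "__main__":
    toy = "ccGAATTCatgaaccggTCTAGAtt"  # hypothetical sequence, not CD4
    print(find_sites(toy, ECORI_SITE), find_sites(toy, XBAI_SITE))
    print(fragment_between(toy, ECORI_SITE, XBAI_SITE))
```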

The amino acids which occur in the various
amino acid sequences referred to in the specification
have their usual three- and one-letter abbreviations,
routinely used in the art, i.e.:

Amino Acid          Abbreviation

L-Alanine           Ala    A
L-Arginine          Arg    R
L-Asparagine        Asn    N
L-Aspartic acid     Asp    D
L-Cysteine          Cys    C
L-Glutamine         Gln    Q
L-Glutamic acid     Glu    E
L-Glycine           Gly    G
L-Histidine         His    H
L-Isoleucine        Ile    I
L-Leucine           Leu    L
L-Lysine            Lys    K
L-Methionine        Met    M
L-Phenylalanine     Phe    F
L-Proline           Pro    P
L-Serine            Ser    S
L-Threonine         Thr    T
L-Tryptophan        Trp    W
L-Tyrosine          Tyr    Y
L-Valine            Val    V
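
For readers who handle sequences programmatically, the
abbreviation table above maps directly onto a lookup
structure. The sketch below is illustrative only; the
dictionary and function names are assumptions, and the
hyphen-separated input format is simply the convention
used in this specification for short motifs such as
lys-arg.

```python
# Three-letter to one-letter abbreviations, as listed in the table.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str, separator: str = "-") -> str:
    """Convert e.g. 'lys-arg' to 'KR'; raises KeyError on unknown names."""
    residues = peptide.split(separator)
    return "".join(THREE_TO_ONE[r.strip().capitalize()] for r in residues)


if __name__ == "__main__":
    print(to_one_letter("lys-arg"))
```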



The promoter region employed to drive the
expression of a gene encoding at least a portion of CD4,
preferably CD4-V1, is derived from a methanol-regulated
alcohol oxidase gene of P. pastoris. P. pastoris is
known to contain two functional alcohol oxidase genes:
the alcohol oxidase I (AOX1) and alcohol oxidase II (AOX2)
genes. The coding portions of the two AOX genes are
closely homologous at the DNA and predicted amino acid
sequence levels and share common restriction sites. The
proteins expressed from the two genes have similar
enzymatic properties, but the promoter of the AOX1 gene is
more efficient and highly expressed; therefore, its use
is preferred for heterologous expression. The AOX1 gene,
including its promoter, has been isolated and thoroughly
characterized [Ellis et al., Mol. Cell. Biol. 5, 1111
(1985)].

The expression cassette used for transforming
P. pastoris cells contains, in addition to the
P. pastoris promoter and the CD4 (CD4-V1) encoding DNA
sequence, a DNA sequence encoding a signal sequence
directing the secretion of the CD4 glycoprotein or a
portion thereof in P. pastoris, preferably a DNA encoding
the in-reading-frame S. cerevisiae AMF pre-pro sequence
and a DNA sequence encoding the AMF processing site, lys-arg
(also referred to as the lys-arg encoding sequence), and a
P. pastoris transcription terminator. Although the
S. cerevisiae AMF pre-pro sequence is preferred, other
signal sequences suitable for directing foreign protein
secretion in P. pastoris may also be used. One such
sequence is, for example, the Saccharomyces cerevisiae
invertase signal sequence.

The S. cerevisiae alpha-mating factor is a
13-residue peptide, secreted by cells of the "alpha" mating
type, that acts on cells of the opposite "a" mating type
to promote efficient conjugation between the two cell
types and thereby formation of "a-alpha" diploid cells
[Thorner et al., The Molecular Biology of the Yeast
Saccharomyces, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 143 (1981)]. The AMF pre-pro sequence is a
leader sequence contained in the AMF precursor molecule,
which, together with the lys-arg encoding sequence, is
necessary for proteolytic processing and secretion (see
e.g. Brake et al., supra). The AMF pre-pro sequence,
including the lys-arg encoding sequence, is a 255 bp
fragment which is illustrated in Figure 1.

The P. pastoris transcription terminator used
in accordance with the present invention has a subsegment
which encodes a polyadenylation signal and
polyadenylation site in the transcript and/or a
subsegment which provides a transcription termination
signal for transcription from the promoter used in the
expression cassette according to the invention (the term
"expression cassette" as used herein and throughout the
specification and claims refers to a DNA sequence which
includes sequences functional for expression and the
secretion processes). The entire transcription
terminator is taken from a P. pastoris protein-encoding
gene, which may be the same as or different from the
P. pastoris gene which is the source of the P. pastoris
promoter used according to the invention.

The DNA fragments according to the invention
further comprise a selectable marker gene. For this
purpose, any selectable marker gene functional in
P. pastoris may be employed, i.e., any gene which confers
a phenotype upon P. pastoris cells thereby allowing them
to be identified and selectively grown from among a vast
majority of untransformed cells. Suitable selectable
marker genes include, for example, selectable marker
systems composed of an auxotrophic mutant P. pastoris
host strain and a wild type biosynthetic gene which
complements the host's defect. For transformation of
his4⁻ P. pastoris strains, for example, the S. cerevisiae
or P. pastoris HIS4 gene, or for transformation of arg4⁻
mutants the S. cerevisiae ARG4 gene or the P. pastoris
ARG4 gene, may be employed.

The term "expression vector" includes vectors
capable of expressing DNA sequences contained therein,
where such sequences are in operational association with
other sequences capable of effecting their expression,
i.e. promoter sequences. In general, expression vectors
used in recombinant DNA technology are often in
the form of "plasmids", i.e. circular, double-stranded
DNA loops which, in their vector form, are not bound to
the chromosome. In the present specification the terms
"vector" and "plasmid" are used interchangeably.
However, the invention is intended to include other forms
of expression vectors as well, which function
equivalently.

In the DNA fragment according to the invention
the segments of the expression cassette are "in
operational association". The DNA sequence encoding CD4
or any portion thereof, as hereinabove defined, is
positioned and oriented functionally with respect to the
promoter, the DNA encoding a signal sequence, preferably
the S. cerevisiae AMF pre-pro sequence, and the DNA
sequence encoding the AMF processing site, lys-arg, and the
transcription terminator, so that the polypeptide
encoding segment is transcribed, under regulation of the
promoter region, into a transcript capable of providing,
upon translation, the desired polypeptide in P. pastoris.
Because of the presence of the signal sequence, e.g. the
AMF pre-pro sequence, the expressed product, CD4 or a
portion thereof, as hereinabove defined, is found as a
secreted entity in the culture medium, properly processed
away from the AMF pre-pro sequence. Appropriate reading
frame positioning and orientation of the various segments
of the expression cassette are within the knowledge of
persons of ordinary skill in the art; further details are
given in the Examples.
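
The reading-frame requirement can be illustrated with
simple arithmetic. The text gives 255 bp for the AMF
pre-pro sequence including the lys-arg encoding sequence,
and at least 106 residues for V1; everything else in the
sketch below (the function names and the optional linker
length) is a hypothetical illustration rather than a
description of the actual constructs in the Examples.

```python
AMF_PREPRO_WITH_LYS_ARG_BP = 255  # length stated in the text
V1_MIN_RESIDUES = 106             # minimum V1 length stated in the text


def codon_count(length_bp: int) -> int:
    """Number of whole codons in a segment; rejects partial codons."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp // 3


def keeps_reading_frame(leader_bp: int, linker_bp: int = 0) -> bool:
    """True if a coding sequence placed after leader plus linker
    stays in the same reading frame as the leader's start codon."""
    return (leader_bp + linker_bp) % 3 == 0


if __name__ == "__main__":
    print("leader codons:", codon_count(AMF_PREPRO_WITH_LYS_ARG_BP))
    print("V1 coding length (bp):", V1_MIN_RESIDUES * 3)
    print("direct fusion in frame:",
          keeps_reading_frame(AMF_PREPRO_WITH_LYS_ARG_BP))
```

Since 255 is a multiple of three, a coding sequence joined
directly after the lys-arg codons remains in frame; any
intervening linker would also need a length divisible by
three.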

The DNA fragment provided by the present
invention may include sequences allowing for its
replication and selection in bacteria, especially
E. coli. In this way, large quantities of the DNA fragment
can be produced by replication in bacteria.

The term "culture" means a propagation of cells
in a medium conducive to their growth, and all
subcultures thereof. The term "subculture" refers to a
culture of cells grown from cells of another culture
(source culture), or any subculture of the source
culture, regardless of the number of subculturings which
have been performed between the subculture of interest
and the source culture.

The following abbreviations are used throughout
the Examples with the following meanings:
DTT     dithiothreitol
SDS     sodium dodecyl sulfate
PBS     phosphate buffered saline
Mut+    methanol utilization competent
Mut-    methanol utilization defective

Tris-HCl: 1.5 M at pH 8.8 and 0.5 M at pH 6.8. Both are
stored at 4°C.

The buffers and solutions, the composition of
which is not specified in the Examples, are as follows:
WBB: 1X PBS, 0.05% Tween-20, 0.02% NaN3, 0.25%
gelatin.
Laemmli buffer: see Nature 227, 680 (1970).

2. General Methods

Methods of transforming Pichia pastoris, as well
as methods applicable for culturing P. pastoris cells
containing in their genome a gene for a heterologous
protein, are known generally in the art.

According to the invention, the expression
cassettes are generally transformed into the P. pastoris
cells by the whole-cell lithium chloride yeast
transformation system [Ito et al., Agric. Biol. Chem. 48,
341 (1984)], with minor modification necessary for
adaptation to P. pastoris. Alternatively, the
spheroplast technique, described by Cregg et al.,
Mol. Cell. Biol. 5, 3376 (1985), can also be used for the
transformation of P. pastoris cells. The whole-cell
lithium chloride method is more convenient in that it
does not require the generation and maintenance of
spheroplasts.

Positive transformants are characterized by
Southern blot analysis [E. M. Southern, J. Mol. Biol.
98, 503 (1975); Maniatis et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, New York, USA (1982)] for the site of
DNA integration, and Northern blots [Alwine et al., Proc.
Natl. Acad. Sci. USA 74, 5350 (1977); Maniatis, Op. Cit.]
for methanol-responsive heterologous gene expression.

Total RNA for Northern blot analysis was prepared
essentially as described by Zitomer et al.,
J. Biol. Chem. 251, 6320 (1976).

Nick translation can be performed according to
Meinkoth et al., Methods in Enzymology 155, 91 (1987).

Transformed strains which are of the desired
phenotype and genotype are grown in fermentors. For the
large-scale production of recombinant DNA-based products
in P. pastoris, a three-stage, high cell-density, batch
fermentation system is normally employed. In the first,
or growth, stage, expression hosts are cultured in defined
minimal medium with excess glycerol as carbon source. On
this carbon source heterologous gene expression is
completely repressed, which allows the generation of cell
mass in the absence of heterologous protein expression.
Next, a period of glycerol-limited growth is allowed
to further increase cell density. Subsequent to the
glycerol-limited growth, methanol is added, initiating
the expression of the desired heterologous protein. This
third stage is the so-called production stage. The
fermentation of CD4-V1 essentially followed the three-stage
protocol described in Digan et al., Bio/Technology
7, 160 (1989). However, as shown in the Examples and in
the description of preferred embodiments, in order to
obtain a stable product, the pH had to be maintained at a
lower level than usual for Pichia pastoris fermentations.

3. Description of Preferred Embodiments

According to a preferred embodiment of the
present invention, the V1 region of the human CD4 molecule
is produced in Pichia pastoris. This V1 region contains a
single disulfide bond between two cysteine residues,
which are located near the N- and C-termini,
respectively.

The heterologous protein expression system used
for CD4-V1 production preferably utilizes the promoter
derived from the methanol-regulated AOX1 gene of
P. pastoris, which is very efficiently expressed and
tightly regulated. This gene is the source of the
transcription terminator as well. The expression
cassette preferably comprises, in operational
association, a P. pastoris AOX1 promoter, DNA encoding
the S. cerevisiae AMF pre-pro sequence, a DNA sequence
encoding the AMF processing site lys-arg, a DNA sequence
encoding the CD4-V1 molecule, and a transcription
terminator derived from the P. pastoris AOX1 gene.

The host cells to be transformed with a linear
vector comprising the expression cassette are P. pastoris
cells having at least one mutation that can be
complemented with a marker gene present on a transforming
DNA fragment. Preferably his4- (GS115) auxotrophic mutant
P. pastoris strains are employed.

The expression cassette is inserted into a
plasmid containing a marker gene complementing the host's
defect. pBR322-based plasmids, e.g. pAO815, are
preferred. Plasmid pAO815 comprising the CD4-V1
expression/secretion cassette is called pSCD103. The
construction of this plasmid is disclosed in
Example 1.

To develop expression strains, the expression
cassette is preferably integrated into the host genome
by means of the homologous sequences present on the
transforming DNA. The expression cassette or entire
vector is integrated into the host genome by a one-step
gene replacement or addition technique. This approach
avoids the problems of plasmid instability. As a result
of gene replacement, Mut- strains are obtained. Mut refers
to the methanol-utilization phenotype. In Mut- strains,
the AOX1 gene is replaced with the expression cassette,
thus decreasing the transformant's ability to utilize
methanol. A slow growth rate on methanol is maintained
by expression of the AOX2 gene product. The
transformants in which the expression cassette has
integrated into the AOX1 locus by site-directed
recombination can be identified by first screening for
the presence of the complementing gene. This is
preferably accomplished by growing the cells in a medium
lacking the complementing gene product and identifying
those cells which are able to grow by virtue of
expression of the complementing gene. Next, the selected
cells are screened for their Mut phenotype by growing
them in the presence of methanol and monitoring their
growth rate.
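
The two-step screen described above (selection for the complementing gene, followed by growth on methanol) can be sketched in code. This is only an illustrative sketch: the Transformant record, the relative growth-rate measure and the 0.5 cutoff are assumptions made for the example and are not taken from the specification.

```python
from dataclasses import dataclass

# Hypothetical cutoff: a transformant growing on methanol at less than half
# the rate of a Mut+ control is scored as Mut-. The value is illustrative only.
SLOW_GROWTH_CUTOFF = 0.5


@dataclass
class Transformant:
    name: str
    grows_without_histidine: bool    # step 1: complementing marker is expressed
    relative_methanol_growth: float  # step 2: growth rate vs. a Mut+ control


def classify(t: Transformant) -> str:
    """Apply the two screening steps in the order given in the text."""
    if not t.grows_without_histidine:
        return "not transformed"
    if t.relative_methanol_growth < SLOW_GROWTH_CUTOFF:
        return "Mut-"   # slow growth on methanol
    return "Mut+"       # normal growth on methanol


if __name__ == "__main__":
    candidates = [
        Transformant("A", True, 0.2),
        Transformant("B", True, 1.0),
        Transformant("C", False, 0.0),
    ]
    for t in candidates:
        print(t.name, classify(t))
```

In practice the cutoff would have to be set against control strains grown in parallel.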

To develop Mut+ strains, the expression cassette
preferably is integrated into the host genome by
transformation of the GS115 host with SacI-linearized
plasmid pSCD103 comprising the V1 expression cassette.
The integration is by addition at a locus or loci having
homology with one or more sequences present on the
transformation vector.

Positive transformants are characterized by
Southern analysis for the site of DNA integration, by
Northern analysis for methanol-responsive CD4-V1 gene
expression, and by immunoblot product analysis for the
presence of secreted CD4-V1 in the growth media.
P. pastoris strains which have integrated one or multiple
copies of plasmid at a desired site are identified by
Southern blot analysis. Strains which demonstrate
enhanced expression of the heterologous gene may be
identified by Northern analysis, and enhanced secretion
of the recombinant protein by product analysis.

For Mut- strains the CD4-V1 production levels
were found to be somewhat lower than for Mut+ strains, but
the difference was not very significant. Mut+ P. pastoris
strains integrating multiple copies of the expression
vector (or of the AMF-V1 expression cassette) used for
transformation at the AOX1 locus are preferred, since an
increase in copy number often increases productivity.

P. pastoris transformants which are identified
to have the desired genotype and phenotype are grown in
fermentors. Typically a three-step production process is
used. Initially, cells are grown on a repressing carbon
source, preferably excess glycerol. In this stage the
cell mass is generated in the absence of expression. Next, a
period of glycerol-limited growth is allowed, and then
a limiting methanol feed is initiated, resulting in the
expression of the V1 gene driven by the AOX1 promoter.

It has been found that in the usual pH range
used for heterologous protein production in P. pastoris
(about pH 5.0) the V1 product suffers substantial
proteolytic degradation. To avoid or, at least, reduce
product degradation, fermentation is preferably performed
at a pH below about 3.5, preferably between about pH 2.5
and 3.5, more preferably between about pH 2.5 and 3.0,
for example at about pH 2.6. The pH can be adjusted to
the desired value by methods known in the art, preferably
before the induction of V1 production.
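
The nested pH preferences stated above can be written out as a small lookup. This is a minimal sketch, assuming that the "about" qualifiers may be ignored and the stated limits treated as exact, inclusive bounds; the function name and the labels are illustrative only.

```python
def ph_preference(ph: float) -> str:
    """Map a fermentation pH onto the nested ranges given in the text.

    Assumption: the 'about' qualifiers are dropped and limits are inclusive.
    """
    if 2.5 <= ph <= 3.0:
        return "more preferred (pH 2.5 to 3.0)"
    if 2.5 <= ph <= 3.5:
        return "preferred (pH 2.5 to 3.5)"
    if ph < 3.5:
        return "acceptable (below pH 3.5)"
    return "outside the stated range"


if __name__ == "__main__":
    for value in (2.6, 3.2, 2.0, 5.0):
        print(value, ph_preference(value))
```

The example value of pH 2.6 falls in the narrowest range, while the usual pH of about 5.0 falls outside all of them.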

The level of CD4-V1 secreted into the media can,
for example, be determined by quantitative Western blot
analysis of the media in parallel with a standard (e.g.
an E. coli-produced V1 standard), using reducing or
non-reducing conditions.

The invention is further illustrated by the
following non-limiting examples.

Example 1

Vector construction

I. Construction of the expression vector pSCD103

The expression vector construction disclosed in
the present application was performed using standard
procedures, as described, for example, in Maniatis et al.,
Supra, and Davis et al., Basic Methods in Molecular
Biology, Elsevier Science Publishing, Inc., New York
(1986).

A 2.2 kb linear BglII-NheI DNA fragment
containing a segment encoding the V1 portion of human CD4,
accompanied by flanking DNA from E. coli, was obtained
from Smith Kline & French Laboratories (U.S.A.). The
V1-encoding sequence was excised from this fragment by
digestion with EcoRI and XbaI, and isolating the 482 bp
fragment (the sequence encoding amino acids 1-106 of
mature CD4 along with its leader sequence) on a 1.3%
agarose gel (Figure 2). Fifty nanograms of the 482 bp
fragment were ligated to 100 ng of the plasmid pIBI25,
previously cut with EcoRI and XbaI. Plasmid pIBI25 was
purchased from IBI, New Haven, CT, and contains an f1
origin of replication and the T7 promoter. E. coli
MC1061 cells [M.J. Casadaban and S.N. Cohen, J. Mol.
Biol. 138, 179 (1980)] were transformed with the ligation
products and ampR colonies were selected. The correct plasmid
demonstrated a 477 bp band upon digestion with EcoRI and
XbaI and was called pSCD4.

The AMF pre-pro sequence was isolated from
M13mp19aMF pre-pro by digesting with EcoRI and BamHI and
isolating the about 267 bp fragment on a 1.3% agarose
gel. To prepare plasmid M13mp19aMF, 15 µg of plasmid
pAO208 (the construction of which is described
hereinafter) were digested with HindIII, filled in with
Klenow-fragment DNA polymerase, and digested with EcoRI.
The digestion was run on a 1.7% agarose gel and the 267
bp fragment comprised of the AMF pre-pro sequence was
isolated. The hEGF (human epidermal growth factor) gene
and the AMF pre-pro sequence in the same translational
direction were inserted into M13mp19 (New England
Biolabs) by the following procedure:
10 ng of M13mp19 were digested with SmaI and
EcoRI and the large, about 7240 bp plasmid fragment was
isolated on a 0.8% agarose gel. The plasmid fragment and
the 267 bp AMF fragment were ligated together by T4 DNA
ligase. The ligation mixture was transformed into JM103
cells and DNA from the plaque was characterized. The
correct plasmid was called M13mp19aMF.

Twenty-five nanograms of the EcoRI-BamHI
fragment of M13mp19aMF pre-pro were ligated to 100 ng of
pIBI25 previously cut with EcoRI and BamHI, and the
ligation products were transformed into MC1061 cells.
AmpR colonies were selected and the correct plasmid was
identified by digestion with EcoRI and BamHI. The
correct plasmid demonstrated a 260 bp band, and was
called pAMF101 (Figure 3).

pSCD4 was digested with EcoRI, made blunt-ended
by treatment with Klenow fragment of E. coli DNA
polymerase I, and then digested with XbaI. The 477 bp
V1-encoding fragment was isolated on a 1.2% agarose gel.
pAMF101 was likewise digested with BamHI, treated with
Klenow fragment of E. coli DNA polymerase I, digested
with XbaI, and then dephosphorylated. Fifty nanograms of
the 477 bp V1-encoding fragment were ligated to 100 ng of
the linearized vector, and the ligation was transformed
into E. coli CJ236 cells (BioRad, Richmond, CA; Muta-gene
mutagenesis kit, # 170-3571). AmpR colonies were
selected. The correct plasmid exhibited a 740 bp band
upon digestion with EcoRI and XbaI and was called pSCD101
(Figure 3).

Mutagenesis was performed to fuse the AMF pre-pro
sequences directly to the V1 coding region; the STE2
processing sites (glu-ala-glu-ala) of the AMF pre-pro
sequence and the native CD4 leader sequence were
eliminated by the oligonucleotide-directed mutagenesis.
Single-stranded pSCD101 template was prepared following
the procedure of Russel et al., Gene 45, 333 (1986),
using the helper phage R408. The mutagenizing and
screening oligonucleotide was of the following sequence:
5' GGG TAT CTT TGG ATA AAA GAA AGA AAG TGG
TGC TGG GCA A 3'
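
To see how the mutagenizing oligonucleotide joins the two coding regions, it can be translated in silico. The sketch below uses the standard genetic code. The reading-frame offset of 2 is an assumption, chosen because it is the frame in which the lys-arg processing site appears; the function and variable names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Mutagenizing oligonucleotide as printed in the text, spaces removed.
OLIGO = "GGG TAT CTT TGG ATA AAA GAA AGA AAG TGG TGC TGG GCA A".replace(" ", "")


def translate(dna: str, offset: int = 0) -> str:
    """Translate the complete codons of `dna`, starting at `offset`."""
    end = offset + 3 * ((len(dna) - offset) // 3)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, end, 3))


if __name__ == "__main__":
    # Assumed frame: the third nucleotide starts the first complete codon.
    print(translate(OLIGO, offset=2))  # VSLDKRKKVVLG
```

Read in this frame, the oligonucleotide encodes Leu-Asp-Lys-Arg followed immediately by Lys-Lys-Val-Val-Leu-Gly, with no glu-ala repeat between the lys-arg site and the downstream residues.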

Mutagenesis reaction products were transformed
into MC1061 cells; colonies transformed with the
mutagenized plasmid were first identified by
hybridization with the screening oligonucleotide, and
then the correct mutagenesis was confirmed by sequencing.
The correctly mutagenized plasmid was called pSCD102.

An EcoRI linker was added to the 3' end of the
AMF pre-pro-V1 insert by digesting pSCD102 with XbaI,
blunt-ending with E. coli DNA polymerase I Klenow
fragment, and ligating 100 ng of the vector to 15 ng of
EcoRI linkers having the sequence:

5' GGAATTCC 3'

The ligation products were digested with EcoRI and the
560 bp AMF pre-pro-V1 fragment was isolated on a 1.2%
agarose gel. Twenty nanograms of the 560 bp fragment
were then ligated to 100 ng of EcoRI-digested and
phosphatase-treated pAO815 (the construction of which is
described hereinbelow). The ligation products were
transformed into MC1061 cells and the ampR colonies were
selected. The correct plasmid demonstrated a 1675 bp
band upon digestion with PstI, and was called pSCD103
(Figure 3). The nucleotide and amino acid sequence of
the EcoRI insert of pSCD103 is shown in Figure 4.

II. Construction of plasmid pAO208:

The AOX1 transcription terminator was isolated
from 20 µg of pPG2.0 [pPG2.0 = BamHI-HindIII fragment of
pPG4.0 (NRRL 15868) + pBR322] by StuI digestion followed
by the addition of 0.2 µg SalI linkers (GGTCGACC). The
plasmid was subsequently digested with HindIII and the
350 bp fragment isolated from a 10% acrylamide gel and
subcloned into pUC18 (Boehringer Mannheim) digested with
HindIII and SalI. The ligation mix was transformed into
JM103 cells (that are widely available) and ampR colonies
were selected. The correct construction was verified by
HindIII and SalI digestion, which yielded a 350 bp
fragment, and was called pAO201.

5 µg of pAO201 was digested with HindIII,
filled in using E. coli DNA Polymerase I Klenow fragment,
and 0.1 ng of BglII linkers (GAGATCTC) were added. After
digestion of the excess BglII linkers, the plasmid was
reclosed and transformed into MC1061 cells. AmpR cells
were selected, DNA was prepared, and the correct plasmid
was verified by BglII, SalI double digests, yielding a
350 bp fragment, and by a HindIII digest to show loss of
the HindIII site. This plasmid was called pAO202.
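
The linkers used in these steps are self-complementary, which is why a single oligonucleotide can anneal with itself to form a blunt, double-stranded linker. A short sketch checks this for the SalI and BglII linker sequences quoted above. The recognition sites listed in the code (GTCGAC for SalI, AGATCT for BglII) are the standard ones and are not stated in the text; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper-case ACGT)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if the sequence reads the same on both strands, 5' to 3'."""
    return seq == reverse_complement(seq)


# enzyme: (linker sequence from the text, standard recognition site)
LINKERS = {
    "SalI": ("GGTCGACC", "GTCGAC"),
    "BglII": ("GAGATCTC", "AGATCT"),
}

if __name__ == "__main__":
    for enzyme, (linker, site) in LINKERS.items():
        print(enzyme, is_self_complementary(linker), site in linker)
```

Both linkers pass both checks, so ligating either one onto a blunt end introduces the corresponding restriction site.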

The alpha factor-GRF fusion was isolated as a
360 bp BamHI-PstI partial digest from pYSV201. Plasmid
pYSV201 is the EcoRI-BamHI fragment of GRF-E-3 inserted
into M13mp18 (New England Biolabs). Plasmid GRF-E-3 is
described in EP 206,783. 20 µg of pYSV201 plasmid was
digested with BamHI and partially digested with PstI. To
this partial digest was added the following
oligonucleotides:
5' AATTCGATGAGATTTCCTTCAATTTTTACTGCA 3'
3'     GCTACTCTAAAGGAAGTTAAAAATG     5'.
Only the antisense strand of the oligonucleotide was
kinase labelled so that the oligonucleotides did not
polymerize at the 5'-end. After acrylamide gel
electrophoresis (10%), the fragment of 385 bp was
isolated by electroelution. This EcoRI-BamHI fragment
of 385 bp was cloned into pAO202 which had been cut with
EcoRI and BamHI. Routinely, 5 ng of vector cut with the
appropriate enzymes and treated with calf intestine
alkaline phosphatase was ligated with 50 ng of the
insert fragment. MC1061 cells were transformed, ampR
cells were selected, and DNA was prepared. In this case,
the resulting plasmid, pAO203, was cut with EcoRI and
BglII to yield a fragment of greater than 700 bp. The
alpha factor-GRF fragment codes for the (1-40) leu27 version of
GRF and contains the processing sites
lys-arg-glu-ala-glu-ala.

The AOX1 promoter was isolated as a 1900 bp
EcoRI fragment from 20 µg of pAOP3 and subcloned into
EcoRI-digested pAO203. The development of pAOP3 is
disclosed in EP 226,846 and described hereinbelow.
MC1061 cells were transformed with the ligation reaction,
ampR colonies were selected, and DNA was prepared. The
correct orientation contains a ≈376 bp HindIII fragment,
whereas the wrong orientation has a ≈675 bp fragment.
One such transformant was isolated and was called pAO204.
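
The orientation test described above amounts to looking for a diagnostic band in the HindIII digest. A minimal sketch follows, assuming a gel-sizing tolerance of 10% (an illustrative value, not from the text); the band lists in the usage lines are hypothetical.

```python
CORRECT_BP = 376   # diagnostic HindIII fragment, correct orientation
WRONG_BP = 675     # diagnostic HindIII fragment, wrong orientation
TOLERANCE = 0.10   # assumed relative error when sizing bands on a gel


def near(observed: float, expected: float) -> bool:
    """True if an observed band is within tolerance of the expected size."""
    return abs(observed - expected) <= TOLERANCE * expected


def orientation(bands_bp: list) -> str:
    """Classify insert orientation from the observed HindIII band sizes."""
    has_correct = any(near(b, CORRECT_BP) for b in bands_bp)
    has_wrong = any(near(b, WRONG_BP) for b in bands_bp)
    if has_correct and not has_wrong:
        return "correct"
    if has_wrong and not has_correct:
        return "wrong"
    return "ambiguous"


if __name__ == "__main__":
    print(orientation([380, 5000]))  # hypothetical digest
    print(orientation([670, 5000]))  # hypothetical digest
```

The same pattern applies to any orientation check that relies on one band of a known size per orientation.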

The parent vector for pAO208 is the HIS4, PARS2
plasmid pYJ32 (NRRL B-15891), which was modified to change
the EcoRV site in the tetR gene to a BglII site by
digesting pYJ32 with EcoRV and adding BglII linkers to
create pYJ32(+BglII). This plasmid was digested with
BglII and the 1.75 kb BglII fragment from pAO204
containing the AOX1 promoter-alpha factor GRF-AOX1 3'
expression cassette was inserted. The resulting vector
was called pAO208. The orientation was verified by an
EcoRI digest yielding an 850 bp fragment + vector, as
opposed to 1.1 kb + vector in the other orientation.

a. Construction of plasmid pAOP3:

1. Plasmid pPG2.5 [a pBR322-based plasmid
containing the approximately 2.5 kbp EcoRI-SalI fragment
from plasmid pPG4.0, which plasmid contains the primary
alcohol oxidase gene (AOX1) and regulatory regions and
which is available in an E. coli host from the Northern
Regional Research Center of the United States Department
of Agriculture in Peoria, Illinois as NRRL B-15868] was
linearized with BamHI;

2. The linearized plasmid was digested with
BAL31;

3. The resulting DNA was treated with E. coli
DNA Polymerase I Klenow fragment to enhance blunt ends,
and ligated to EcoRI linkers;

4. The ligation products were transformed
into E. coli strain MM294;

5. Transformants were screened by the colony
hybridization technique using a synthetic oligonucleotide
having the following sequence:

5' TTATTCGAAACGGGAATTCC 3'.

This oligonucleotide contains the AOX1 promoter sequence
up to, but not including, the ATG initiation codon, fused
to the sequence of the EcoRI linker;

6. Positive clones were sequenced by the
Maxam-Gilbert technique. All three positives had the
following sequence:

5' ...TTATTCGAAACGAGGAATTCC... 3'.

They all retained the "A" of the ATG (underlined in the
above sequence). It was decided that this A would
probably not be detrimental; thus all subsequent clones
are derivatives of these positive clones. These clones
have been given the laboratory designation pAOP1, pAOP2
and pAOP3 respectively.

III. Construction of plasmid pAO815

Plasmid pAO815 was constructed by mutagenizing
plasmid pAO807 (described hereinbelow) to change the ClaI
site downstream of the AOX1 transcription terminator in
pAO807 to a BamHI site. The oligonucleotide used for
mutagenizing pAO807 had the following sequence: 5' GAC
GTT CGT TTG TGC GGA TCC AAT GCG GTA GTT TAT 3'. The
mutagenized plasmid was called pAO807-Bam. Plasmid
pAO804 was digested with BglII and 25 ng of the 2400 bp
fragment were ligated to 250 ng of the 5400 bp BglII
fragment from BglII-digested pAO807-Bam. The ligation
mix was transformed into MC1061 cells and the correct
construct was verified by digestion with PstI/BamHI to
identify 5700 and 2100 bp sized bands. The correct
construct was called pAO815. The restriction map of the
expression vector pAO815 is shown in Figure 5.
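
The fragment sizes quoted for pAO815 can be cross-checked: the two ligated BglII fragments and the two diagnostic PstI/BamHI bands should both add up to the size of the circular plasmid. The sketch below does that arithmetic and also shows how band sizes follow from cut positions on a circle. The cut coordinates used in the example are hypothetical, since the text gives band sizes only.

```python
def circular_fragments(size_bp: int, cut_positions: list) -> list:
    """Fragment sizes from a complete digest of a circular plasmid."""
    cuts = sorted(cut_positions)
    fragments = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        # A single cut linearizes the plasmid and gives one full-length band.
        fragments.append((next_cut - cut) % size_bp or size_bp)
    return fragments


LIGATED_FRAGMENTS_BP = [2400, 5400]   # BglII fragments joined to make pAO815
DIAGNOSTIC_BANDS_BP = [5700, 2100]    # PstI/BamHI digest of the product

if __name__ == "__main__":
    plasmid_size = sum(LIGATED_FRAGMENTS_BP)
    print(plasmid_size, sum(DIAGNOSTIC_BANDS_BP))
    # Hypothetical coordinates: one cut at position 0, the other 2100 bp away.
    print(sorted(circular_fragments(plasmid_size, [0, 2100])))
```

Both sums come to 7800 bp, so the quoted band sizes are consistent with the quoted fragment sizes.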

a. Plasmid pAO807 was constructed as follows:

1. Preparation of f1-ori DNA

f1 bacteriophage DNA (50 µg) was digested with
50 units of RsaI and DraI (according to manufacturer's
directions) to release the ≈458 bp DNA fragment
containing the f1 origin of replication (ori). The
digestion mixture was extracted with an equal volume of
phenol:chloroform (V/V) followed by extracting the
aqueous layer with an equal volume of chloroform and
finally the DNA in the aqueous phase was precipitated by
adjusting the NaCl concentration to 0.2 M and adding 2.5
volumes of absolute ethanol. The mixture was allowed to
stand on ice (4°C) for 10 minutes and the DNA precipitate
was collected by centrifugation for 30 minutes at 10,000
x g in a microfuge at 4°C.
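
The precipitation step involves two small calculations: how much concentrated NaCl to add to reach 0.2 M, and how much ethanol makes 2.5 volumes. A sketch of both follows, assuming a 5 M NaCl stock and a 100 µl sample that contains no NaCl beforehand. Both values are assumptions for illustration and are not given in the text.

```python
def nacl_stock_volume(sample_ul: float, stock_molar: float,
                      final_molar: float = 0.2) -> float:
    """Volume of NaCl stock (µl) that brings the mixture to `final_molar`.

    Assumes the sample contains no NaCl beforehand. Derived from
    stock_molar * v = final_molar * (sample_ul + v).
    """
    return final_molar * sample_ul / (stock_molar - final_molar)


def ethanol_volume(aqueous_ul: float, volumes: float = 2.5) -> float:
    """Volume of absolute ethanol (µl) for the given aqueous volume."""
    return volumes * aqueous_ul


if __name__ == "__main__":
    sample = 100.0                                    # µl, hypothetical
    salt = nacl_stock_volume(sample, stock_molar=5.0)  # 5 M stock, assumed
    ethanol = ethanol_volume(sample + salt)
    print(f"{salt:.2f} µl NaCl stock, {ethanol:.1f} µl ethanol")
```

The ethanol volume is computed on the salted sample, since the 2.5 volumes refer to the aqueous phase after the NaCl adjustment.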

The DNA pellet was washed 2 times with 70%
aqueous ethanol. The washed pellet was vacuum dried and
dissolved in 25 µl of TE buffer [1.0 mM EDTA in 0.01 M
(pH 7.4) Tris buffer]. This DNA was electrophoresed on
1.5% agarose gel and the ≈458 bp f1-ori fragment was
electroeluted onto DE81 (Whatman) paper and eluted from
the paper in 1 M NaCl. The DNA solution was precipitated
as detailed above and the DNA precipitate was dissolved
in 25 µl of TE buffer (f1-ori fragment).

2. Cloning of f1-ori into DraI sites of pBR322

pBR322 (2 µg) was partially digested with 2
units DraI (according to manufacturer's instructions).
The reaction was terminated by phenol:chloroform
extraction followed by precipitation of DNA as detailed
in step 1 above. The DNA pellet was dissolved in 20 µl
of TE buffer. About 100 ng of this DNA was ligated with
100 ng of f1-ori fragment (step 1) in 20 µl of ligation
buffer by incubating at 14°C overnight with 1 unit of
T4 DNA ligase. The ligation was terminated by heating to
70°C for 10 minutes and then used to transform E. coli
strain JM103 [Janisch-Perron et al., Gene 22, 103
(1983)]. AmpR transformants were pooled and
superinfected with helper phage R408 [Russel et al.,
Supra]. Single-stranded phages were isolated from the
media and used to reinfect JM103. AmpR transformants
contained pBRf1-ori,
which contains f1-ori cloned into the DraI sites
(nucleotide positions 3232 and 3251) of pBR322.

3. Construction of plasmid pAO807

pBRf1-ori (10 ng) was digested for 4 hours at
37°C with 10 units each of PstI and NdeI. The digested
DNA was phenol:chloroform extracted, precipitated and
dissolved in 25 µl of TE buffer as detailed in step 1
above. This material was electrophoresed on a 1.2%
agarose gel and the NdeI-PstI fragment (approximately
0.8 kb) containing the f1-ori was isolated and dissolved
in 20 µl of TE buffer as detailed in step 1 above. About
100 ng of this DNA was mixed with 100 ng of pAO804
(described hereinafter) that had been digested with PstI
and NdeI and phosphatase-treated. This mixture was
ligated in 20 µl of ligation buffer by incubating
overnight at 14°C with 1 unit of T4 DNA ligase. The
ligation reaction was terminated by heating at 70°C for
10 minutes. This DNA was used to transform E. coli
strain JM103 to obtain pAO807.

b. Plasmid pAO804 employed in the above
procedure was constructed as follows:

Plasmid pBR322 was modified as follows to
eliminate the EcoRI site and insert a BglII site into the
PvuII site:

pBR322 was digested with EcoRI, the protruding
ends were filled in with Klenow Fragment of E. coli DNA
polymerase I, and the resulting DNA was recircularized
using T4 ligase. The recircularized DNA was used to
transform E. coli MC1061 to ampicillin-resistance and
transformants were screened for having a plasmid of about
4.37 kbp in size without an EcoRI site. One such
transformant was selected and cultured to yield a
plasmid, designated pBR322ΔRI, which is pBR322 with the
EcoRI site replaced with the sequence:

5'-GAATTAATTC-3'
3'-CTTAATTAAG-5'.

pBR322ΔRI was digested with PvuII and the
linker, of sequence

5'-CAGATCTG-3'
3'-GTCTAGAC-5'

was ligated to the resulting blunt ends employing T4
ligase. The resulting DNAs were recircularized, also
with T4 ligase, and then digested with BglII and again
recircularized using T4 ligase to eliminate multiple
BglII sites due to ligation of more than one linker to
the PvuII-cleaved pBR322ΔRI. The DNAs, treated to
eliminate multiple BglII sites, were used to transform E.
coli MC1061 to ampicillin-resistance. Transformants were
screened for a plasmid of about 4.38 kbp with a BglII
site. One such transformant was selected and cultured to
yield a plasmid, designated pBR322ΔRIBGL, for further
work. Plasmid pBR322ΔRIBGL is the same as pBR322ΔRI
except that pBR322ΔRIBGL has the sequence

5'-CAGCAGATCTGCTG-3'
3'-GTCGTCTAGACGAC-5'

in place of the PvuII site in pBR322ΔRI.

pBR322ΔRIBGL was digested with SalI and BglII
and the large fragment (approximately 2.97 kbp) was
isolated. Plasmid pBSAGI5I, which is described in
European Patent Application Publication No. 0,226,752,
was digested completely with BglII and XhoI and an
approximately 850 bp fragment from a region of the
P. pastoris AOX1 locus downstream from the AOX1 gene
transcription terminator (relative to the direction of
transcription from the AOX1 promoter) was isolated. The
BglII-XhoI fragment from pBSAGI5I and the approximately
2.97 kbp SalI-BglII fragment from pBR322ΔRIBGL were
combined and subjected to ligation with T4 ligase. The
ligation mixture was used to transform E. coli MC1061
cells to ampicillin-resistance and transformants were
screened for a plasmid of the expected size
(approximately 3.8 kbp) with a BglII site. This plasmid
was designated pAO801. The overhanging end of the SalI
site from the pBR322ΔRIBGL fragment was ligated to the
overhanging end of the XhoI site on the 850 bp pBSAGI5I
fragment and, in the process, both the SalI site and the
XhoI site in pAO801 were eliminated.

pBSAGI5I was then digested with ClaI and the
approximately 2.0 kbp fragment was isolated. The 2.0 kbp
fragment has an approximately 1.0 kbp segment which
comprises the P. pastoris AOX1 promoter and transcription
initiation site, an approximately 700 bp segment encoding
the hepatitis B virus surface antigen ("HBsAg") and an
approximately 300 bp segment which comprises the
P. pastoris AOX1 gene polyadenylation signal and
site-encoding segments and transcription terminator. The
HBsAg coding segment of the 2.0 kbp fragment is
terminated, at the end adjacent the 1.0 kbp segment with
the AOX1 promoter, with an EcoRI site and, at the end
adjacent the 300 bp segment with the AOX1 transcription
terminator, with a StuI site, and has its subsegment which
codes for HBsAg oriented and positioned, with respect to
the 1.0 kbp promoter-containing and 300 bp transcription
terminator-containing segments, operatively for
expression of the HBsAg upon transcription from the AOX1
promoter. The EcoRI site joining the promoter segment to
the HBsAg coding segment occurs just upstream (with
respect to the direction of transcription from the AOX1
promoter) from the translation initiation signal-encoding
triplet of the AOX1 promoter.

For more details on the promoter and terminator
segments of the 2.0 kbp, ClaI-site-terminated fragment of
pBSAGI5I, see European Patent Application Publication No.
0,226,846 and Ellis et al., Mol. Cell. Biol. 5, 1111
(1985).

Plasmid pAO801 was cut with ClaI and combined
for ligation using T4 ligase with the approximately 2.0
kbp ClaI-site-terminated fragment from pBSAGI5I. The
ligation mixture was used to transform E. coli MC1061 to
ampicillin resistance, and transformants were screened
for a plasmid of the expected size (approximately 5.8
kbp) which, on digestion with ClaI and BglII, yielded
fragments of about 2.32 kbp (with the origin of
replication and ampicillin-resistance gene from pBR322)
and about 1.9 kbp, 1.48 kbp, and 100 bp. On digestion
with BglII and EcoRI, the plasmid yielded an
approximately 2.48 kbp fragment with the 300 bp
terminator segment from the AOX1 gene and the HBsAg
coding segment, a fragment of about 900 bp containing the
segment from upstream of the AOX1 protein-encoding
segment of the AOX1 gene in the AOX1 locus, and a
fragment of about 2.42 kbp containing the origin of
replication and ampicillin resistance gene from pBR322
and an approximately 100 bp ClaI-BglII segment of the
AOX1 locus (further upstream from the AOX1-encoding
segment than the first mentioned 900 bp EcoRI-BglII
segment). Such a plasmid had the ClaI fragment from
pBSAGI5I in the desired orientation; in the opposite,
undesired orientation, there would be EcoRI-BglII
fragments of about 3.3 kbp, 2.38 kbp and 900 bp.

One of the transformants harboring the desired
plasmid, designated pAO802, was selected for further work
and was cultured to yield that plasmid. The desired
orientation of the ClaI fragment from pBSAGI5I in pAO802
had the AOX1 gene in the AOX1 locus oriented correctly to
lead to the correct integration into the P. pastoris
genome at the AOX1 locus of linearized plasmid made by
cutting at the BglII site at the terminus of the 800 bp
fragment from downstream of the AOX1 gene in the AOX1
locus.

pAO802 was then treated to remove the HBsAg
coding segment terminated with an EcoRI site and a StuI
site. The plasmid was digested with StuI and a linker of
sequence:

5'-GGAATTCC-3'
3'-CCTTAAGG-5'

was ligated to the blunt ends using T4 ligase. The
mixture was then treated with EcoRI and again subjected
to ligation using T4 ligase. The ligation mixture was
then used to transform E. coli MC1061 cells to ampicillin
resistance and transformants were screened for a plasmid
of the expected size (5.1 kbp) with EcoRI-BglII fragments
of about 1.78 kbp, 900 bp, and 2.42 kbp and BglII-ClaI
fragments of about 100 bp, 2.32 kbp, 1.48 kbp, and 1.2
kbp. This plasmid was designated pAO803. A transformant
with the desired plasmid was selected for further work
and was cultured to yield pAO803.
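
A quick consistency check on a restriction map is that the fragments of each complete digest of a circular plasmid add up to the plasmid size. The sketch below applies this to the sizes quoted above for pAO803. The function name and the 50 bp tolerance are assumptions made for the example.

```python
def digest_matches_size(fragments_kbp: list, plasmid_kbp: float,
                        tol_kbp: float = 0.05) -> bool:
    """True if the fragment sizes sum to the plasmid size within tolerance."""
    return abs(sum(fragments_kbp) - plasmid_kbp) <= tol_kbp


PAO803_KBP = 5.1
ECORI_BGLII_KBP = [1.78, 0.90, 2.42]        # EcoRI-BglII digest
BGLII_CLAI_KBP = [0.10, 2.32, 1.48, 1.20]   # BglII-ClaI digest

if __name__ == "__main__":
    print(digest_matches_size(ECORI_BGLII_KBP, PAO803_KBP))
    print(digest_matches_size(BGLII_CLAI_KBP, PAO803_KBP))
```

Both digests account for the full 5.1 kbp, which is what the screening criterion in the text requires.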

Plasmid pAO804 was then made from pAO803 by
inserting, into the BamHI site from pBR322 in pAO803, an
approximately 2.75 kbp BglII fragment from the
P. pastoris HIS4 gene. See, e.g., Cregg et al., Mol.
Cell. Biol. 5, 3376 (1985) and European Patent
Application Publication Nos. 0,180,899 and 0,188,677.
pAO803 was digested with BamHI and combined with the HIS4
gene-containing BglII site-terminated fragment and the
mixture subjected to ligation using T4 ligase. The
ligation mixture was used to transform E. coli MC1061
cells to ampicillin-resistance and transformants were
screened for a plasmid of the expected size (7.85 kbp),
which is cut by SalI. One such transformant was selected
for further work, and the plasmid it harbors was
designated pAO804.

pA0804 has one SalI-ClaI fragment of about 1.5
kbp and another of about 5.0 kbp and a ClaI-ClaI fragment
of 1.3 kbp; this indicates that the direction of
transcription of the HIS4 gene in the plasmid is the same
as the direction of transcription of the ampicillin
resistance gene and opposite the direction of
transcription from the AOX1 promoter.
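
The fragment sizes quoted above for pA0803 and pA0804 can be cross-checked against the stated plasmid sizes, since the fragments of a complete digest of a circular plasmid should add up to its length. The sketch below does this arithmetic in base pairs. The 100 bp tolerance is an assumption introduced here to allow for the "about" in the quoted sizes; it is not a figure from the specification.

```python
# Sketch: cross-check restriction fragment sizes against plasmid sizes.
# All sizes are in bp and come from the text; the tolerance is an assumption.
PA0803_BP = 5100                              # expected size of pA0803 (5.1 kbp)
HIS4_FRAGMENT_BP = 2750                       # BglII HIS4 fragment (about 2.75 kbp)
PA0804_BP = PA0803_BP + HIS4_FRAGMENT_BP      # 7850 bp, i.e. 7.85 kbp


def is_consistent(fragments_bp, expected_bp, tolerance_bp=100):
    """True if the fragments sum to the expected size within the tolerance."""
    return abs(sum(fragments_bp) - expected_bp) <= tolerance_bp


digests = {
    "pA0803 EcoRI-BglII": ([1780, 900, 2420], PA0803_BP),
    "pA0803 BglII-ClaI": ([100, 2320, 1480, 1200], PA0803_BP),
    "pA0804 SalI/ClaI": ([1500, 5000, 1300], PA0804_BP),
}

for name, (fragments, expected) in digests.items():
    print(name, sum(fragments), expected, is_consistent(fragments, expected))
```

The two pA0803 digests sum exactly to 5100 bp. The pA0804 fragments sum to 7800 bp against an expected 7850 bp, which is within the rounding of the quoted sizes.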

The orientation of the HIS4 gene in pA0804 is
not critical to the function of the plasmid or of its
derivatives with cDNA coding segments inserted at the
EcoRI site between the AOX1 promoter and terminator
segments. Thus, a plasmid with the HIS4 gene in the
orientation opposite that of the HIS4 gene in pA0804
would also be effective for use in accordance with the
present invention.

Example 2 

Strain development and characterization 
Plasmid pSCD103, the construction of which is
described in Example 1, was used to develop Mut+ and Mut-
strains of P. pastoris. The His- strain GS115 (ATCC
20864) was the host for all transformations.
Transformations were accomplished by the whole-cell LiCl
method [Ito et al., J. Bacteriol. 153(1), 163 (1983)],
with minor modification necessary for adaptation to
P. pastoris.

To develop Mut+ strains, pSCD103 was digested
with SacI, which linearizes the vector within the AOX1
promoter region, and 10 µg of the linearized vector were
used to transform GS115. Histidine prototrophs were
selected.

To develop Mut- strains, pSCD103 was digested
with BglII, thereby liberating an expression cassette
comprised of the AOX1 promoter region, αMF leader-V1 gene,
AOX1 transcription termination signals, HIS4 gene for
selection, and AOX1 3' region. Both ends of this
expression cassette contain long sequences which are
homologous to the 5' and 3' ends of the AOX1 locus. 10 µg
of the linearized vector were used to transform GS115
cells. Histidine prototrophs were selected and screened
for the Mut- phenotype by replica plating colonies from
glucose-containing media to methanol-containing media,
and evaluating growth rate on methanol. Slow growth on
methanol was indicative of the Mut- phenotype. Several
His+ Mut- colonies were identified.

To characterize the Mut+ and Mut- transformants
for cassette copy number and site of integration, DNA
from several of the selected colonies was digested with
EcoRI and probed with nick-translated pSCD103. The
Southern analysis yielded the following information:



Strain name    Mut+/-   Copy Number   Site of Integration
G+SCD103S03    Mut+     one           AOX1
G+SCD103S16    Mut+     two           AOX1
G-SCD103S03    Mut-     one           AOX1 (disruption)



Example 3 

Fermentation in two-liter fermentors

a. Fermentor start-up and general operation

The 2-liter fermentors (Biolafitte, LSL
Biolafitte, Princeton, NJ) were autoclaved at a volume of
one liter containing 5X Basal Salts (21 ml/l 85%
phosphoric acid, 0.9 g/l Calcium Sulfate x 2H2O, 14.3 g/l
Potassium Sulfate, 11.7 g/l Magnesium Sulfate x 7H2O,
3.25 g/l Potassium Hydroxide) and 5% (w/v) glycerol.
After sterilization, 5 ml of a PTM1 trace salts solution
(6.0 g/l Cupric Sulfate x 5H2O, 0.08 g/l Sodium Iodide,
3.0 g/l Manganese Sulfate x H2O, 0.2 g/l Sodium Molybdate
x 2H2O, 0.02 g/l Boric Acid, 0.5 g/l Cobalt Chloride, 20
g/l Zinc Chloride, 65 g/l Ferrous Sulfate x 7H2O, 0.2 g/l
Biotin, and 5.0 ml of Sulfuric Acid) was added, and the
pH of the fermentor was adjusted to 3.0 with the addition
of concentrated Ammonium Hydroxide. During the
fermentation, the pH was controlled at 3.0 with the
addition of a 50% (v/v) Ammonium Hydroxide solution.

The fermentors were then inoculated with 50 ml
of an overnight culture grown in 6.75 g/l Difco yeast
nitrogen base, 2% glycerol, 0.1 M potassium phosphate, pH
6.0. After 16 hours of fermentor growth, the pH of the
medium was dropped to 2.6 and the cells continued to grow
in a batch mode to exhaust the original charge of
glycerol. Upon glycerol exhaustion, a 50% (w/v) glycerol
feed containing 12 ml/l PTM1 trace salts was initiated at
a feed rate of 5 to 20 ml/h. After 200 ml of the
glycerol feed was added into the fermentor, a 100%
methanol feed containing 12 ml/l PTM1 trace salts was
initiated at 1 ml/h, and the glycerol feed was shut off
after 1 hour of methanol feeding. After 4 hours of
methanol feeding, the methanol feed was increased to 5-6
ml/h over an 8-12 hour period and was maintained at this
rate for the remainder of the fermentation. The
dissolved oxygen concentration was maintained above 20%
of saturation by adjusting agitation and aeration as
needed. The temperature was controlled at 30°C and
foaming was controlled by the addition of a 5% solution
of Struktol J-673 antifoam (Struktol Co., Stow, OH).
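
The feeding schedule described above can be sketched as a simple function of time. The text gives only the end points of the methanol ramp (1 ml/h rising to 5-6 ml/h over 8-12 hours); the linear shape of the ramp and the default values of 10 hours and 5.5 ml/h used below are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Sketch of the glycerol and methanol feed schedule. The linear ramp and
# the default ramp length / final rate are assumptions, not source values.
METHANOL_START_RATE_ML_H = 1.0   # initial methanol feed rate (from the text)
METHANOL_HOLD_H = 4.0            # hours at the initial rate (from the text)


def glycerol_feed_duration_h(volume_ml: float, rate_ml_h: float) -> float:
    """Hours needed to deliver the glycerol feed at a constant rate."""
    return volume_ml / rate_ml_h


def methanol_feed_rate(t_h: float, ramp_h: float = 10.0,
                       final_rate_ml_h: float = 5.5) -> float:
    """Methanol feed rate (ml/h) at t_h hours after the methanol feed starts."""
    if t_h < 0:
        return 0.0
    if t_h <= METHANOL_HOLD_H:
        return METHANOL_START_RATE_ML_H
    if t_h >= METHANOL_HOLD_H + ramp_h:
        return final_rate_ml_h
    fraction = (t_h - METHANOL_HOLD_H) / ramp_h   # position along the ramp
    return METHANOL_START_RATE_ML_H + fraction * (
        final_rate_ml_h - METHANOL_START_RATE_ML_H)


# 200 ml of glycerol feed at the two ends of the stated 5-20 ml/h range.
print(glycerol_feed_duration_h(200, 5), glycerol_feed_duration_h(200, 20))
print([methanol_feed_rate(t) for t in (2, 9, 14, 30)])
```

Under these assumptions the glycerol feed phase lasts between 10 and 40 hours, and the methanol feed reaches its final rate 14 hours after methanol feeding begins.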

Before harvesting the fermentor, the pH was
decreased to 2.5 with the addition of 85% phosphoric
acid. The contents were then centrifuged to remove cells
and the supernatant was filter-sterilized through a
0.22 µ Corning filter (Corning Glass Co., Corning, NY).
The supernatant was then frozen at -20°C.

b. Growth of Mut+ and Mut- strains

Run 568: G+SCD103S03
Run 570: G+SCD103S16
Run 571: G-SCD103S03
Run 585: G+SCD103S03
Run 593: G+SCD103S16

Fermentation Runs 568, 570, 571, 585, and 593
were conducted as described above, except that Runs 568,
570, and 571 were conducted at pH 5.0; Run 585 was
performed at pH 3.5 and the pH was not adjusted to pH 2.5
at the end of the fermentation run; Run 593 was conducted
as hereinabove described.

Figure 6 shows the time course for cell yield
for one-liter fermentation runs with strains G+SCD103S03
(Run 568), G+SCD103S16 (Run 570), and G-SCD103S03 (Run
571). Cell yield was calculated as the mass of wet cells
per liter of broth after centrifugation. A conversion
factor of 0.25 was used to calculate yield of dry cells
per liter.
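
The wet-to-dry conversion above is a single multiplication. As a minimal sketch (the function name and the example wet-cell value are illustrative and are not data from the fermentation runs):

```python
# Sketch: convert wet cell yield to dry cell yield using the 0.25 factor
# stated in the text. The example input is hypothetical.
WET_TO_DRY_FACTOR = 0.25


def dry_cell_yield(wet_g_per_l: float) -> float:
    """Dry cell yield (g/l) estimated from wet cell yield (g/l)."""
    return wet_g_per_l * WET_TO_DRY_FACTOR


print(dry_cell_yield(400.0))  # hypothetical wet yield of 400 g/l
```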

The single-copy Mut+ and Mut- strains grew at
equivalent rates, whereas the two-copy Mut+ strain showed
slightly decreased cell yield on methanol. However,
because Pichia transformants carrying multiple copies of
an expression cassette may express higher levels of
heterologous proteins in the fermentor than do the
strains with a single copy, another fermentation, Run 593,
was conducted to analyze the level of recombinant CD4-V1 in
the broth of the two-copy strain, G+SCD103S16.

Figures 7A and 7B show the time course for cell
yield and CD4-V1 production, respectively, for
fermentation Run 593. The expression level was
determined for unfiltered, reduced broth samples, and was
estimated by quantitative Western blot analysis to be 130
mg/liter after 71 hours on methanol. The level was
continuing to increase when the fermentor was harvested.

Example 4 

Analysis of secreted CD4-V1

a. Western blot analysis

The V1 region of CD4 contains a single disulfide
bond between two cysteine residues, which are located at
positions 16 and 84 of the mature CD4, near the N- and C-
termini, respectively. Therefore, non-reduced V1
molecules from fermentor broth samples will co-migrate
with the V1 standard, regardless of whether the molecule
has been nicked between the cysteines. On the other
hand, reduced samples will only co-migrate with the
standard if the peptide bonds between the cysteines are
intact. Separating P. pastoris broth samples on non-
reducing gels yielded a quantitative measurement of the
total amount of V1 contained in the fermentor broth, while
reducing gels yielded the amount of intact V1.
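
The reasoning above reduces to a small truth table: a V1 molecule fails to co-migrate with the standard only when it is both nicked between the cysteines and run under reducing conditions. The sketch below encodes that rule; the function name is illustrative only.

```python
# Sketch: when does a V1 molecule co-migrate with the intact standard?
# The disulfide bond holds a nicked molecule together unless it is reduced.
def comigrates_with_standard(nicked: bool, reduced: bool) -> bool:
    """True if the molecule runs at the position of the V1 standard."""
    return not (nicked and reduced)


for nicked in (False, True):
    for reduced in (False, True):
        print(f"nicked={nicked} reduced={reduced} "
              f"comigrates={comigrates_with_standard(nicked, reduced)}")
```

This is why the non-reducing gel measures total V1 (intact plus nicked) while the reducing gel measures intact V1 only.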
Fermentor broth samples from Runs 571 (Mut-),
568 and 585 (single-copy Mut+) and 593 (multi-copy Mut+)
were analyzed by Western blotting. Ten microliters of
each sample were mixed with an equal volume of 2X Laemmli
sample buffer containing 200 mM DTT (+DTT). In some
cases, the 2X sample buffer lacked any reducing agent
(-DTT). Fermentor samples and 2X sample buffer were
mixed and immediately boiled for 5 minutes (+DTT), or
mixed and immediately placed at room temperature until
the gel was loaded (-DTT). Samples thus prepared were
separated by electrophoresis, at 4°C, on 15% SDS-PAGE
gels, at 150V constant voltage, until the bromophenol
blue tracking dye had reached the bottom of the gel.

E. coli-produced V1 (Smith Kline & French), used
as standard, was treated in an identical manner, and
separated on the same gels as a standard control.
Reduced (+DTT) and non-reduced (-DTT) samples were
separated on different gels. For quantitation of the
P. pastoris-produced V1 in fermentor broth, fermentor
samples were separated in non-adjacent lanes, to prevent
spillover errors, and several different amounts of V1
standards were separated on the same gel to generate an
internal standard curve.

Gels were transblotted to nitrocellulose (0.1 µ
pore size) for 90 minutes at 4°C, using a carbonate
buffer system [S.D. Dunn, Anal. Biochem. 157, 144 (1986)].
The filters were blocked for 16 hours at room temperature
in Western blocking buffer (WBB), incubated for two hours
with a 1:1000 dilution of rabbit anti-sCD4 [SK&F; Arthos
et al., Cell 52, 469 (1989)] in WBB at room temperature,
washed four times for 15 minutes in WBB, incubated for
one hour at room temperature in a 1:5000 dilution of low
specific activity 125I-Protein A (New England Nuclear) in
WBB, washed four times for 15 minutes each time in WBB,
air dried and exposed to Kodak X-omat film at -70°C, with
two intensifying screens. V1 bands, identified by
reaction with anti-sCD4, were excised from the
nitrocellulose filters and quantitated using a gamma-
counter. The fermentor samples were quantitated by
comparison with the V1 standard curve on each filter. The
results of these analyses are summarized in the following
Table:



                               V1 YIELD (mg/L)
RUN   MUT+/-             pH    TOTAL   INTACT   % INTACT
571   Mut-               5.0   100     0        0
568   single-copy Mut+   5.0   100     0        0
585   single-copy Mut+   3.5   125     30       25
570   two-copy Mut+      5.0   100     0        0
593   two-copy Mut+      2.6   100     100      100
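
The quantitation behind the Table can be sketched as follows: gamma counts from known amounts of V1 standard define a standard curve, sample counts are read back against it, and the proportion intact is the reducing-gel amount divided by the non-reducing-gel amount. The standard amounts and counts below are hypothetical values invented for illustration, and a straight-line fit is an assumption; the specification does not state the form of the curve.

```python
# Sketch: quantitate V1 from gamma counts via a linear standard curve.
# The standard amounts and counts are hypothetical, not source data.
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_counts(cpm, slope, intercept):
    """Invert the standard curve to get the amount of V1 from counts."""
    return (cpm - intercept) / slope


def percent_intact(total, intact):
    """Intact V1 as a percentage of total V1 (0 if no V1 was detected)."""
    return 100.0 * intact / total if total else 0.0


standard_ng = [50, 100, 200, 400]          # hypothetical V1 standard amounts
standard_cpm = [600, 1100, 2100, 4100]     # hypothetical gamma counts

slope, intercept = fit_line(standard_ng, standard_cpm)
sample_ng = amount_from_counts(1600, slope, intercept)
print(slope, intercept, sample_ng)

# Total and intact yields (mg/L) taken from the Table for Runs 593 and 571.
print(percent_intact(100, 100), percent_intact(100, 0))
```

Running the standards on every filter, as the text describes, means each filter is read against its own curve rather than a shared one.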



b. Amino Acid Sequencing

We have further characterized the Pichia-
produced recombinant CD4-V1 by determining the N-terminal
sequence of the protein which comigrates with the V1
standard on reducing SDS-PAGE. As shown in the silver-
stained reducing gel of V1 standard and fermentor broth
samples pictured in Figure 8, control fermentor broth
obtained from the fermentation of Pichia strain G-PA0815
does not contain a protein which comigrates with the V1
standard. In contrast, Pichia rCD4-V1 appears to be the
major lower molecular weight protein species present in
broth samples from the fermentation of strain
G+SCD103S16. To ensure that broth components did not
affect the migration or staining of V1 in polyacrylamide
gels, E. coli-derived V1 standard and rCD4-V1-containing
broth from fermentation of G+SCD103S16 were separately
mixed with Pichia control broth and analyzed by reducing
SDS-PAGE. As shown in Figure 8, the electrophoretic
characteristics of the V1 standard and Pichia-produced
rCD4-V1 were unaltered by exposure to Pichia control
broth; the V1 standard and Pichia rCD4-V1 co-migrated to
the same gel position and exhibited similar staining
properties in the presence and absence of control broth.
The first 15 residues of Pichia rCD4-V1 were determined,
and found to be identical to the published sequence
(Maddon et al., supra) for mature human CD4-V1 (Figure 9).
From this result, it was concluded that the N-terminus of
the recombinant CD4-V1 was correctly processed from the
αMF leader.

c. Stability

The stability of the V1 molecules in the
fermentor broth, which had been adjusted to pH 2.5, was
analyzed in two ways. First, the stability of V1 during
storage was analyzed by subjecting identical broth
samples to a freeze-thaw cycle, followed by incubation of
the samples for 20 hours, under varying conditions
(Figure 10, lanes 2-4). No change was observed in the
amount of immunoreactive material or the proportion of
intact V1 in the different samples. In the second
experiment, the pH of the broth samples was raised to 5.0
and the samples incubated under the same conditions as
before (Figure 10, lanes 5-7). As seen in the Figure,
some broadening of the intact V1 band occurred under these
conditions. Therefore, while the rCD4-V1 is stable in the
pH 2.5 broth samples, some degree of proteolytic
degradation may occur in samples at elevated pHs.

CLAIMS:

1. A Pichia pastoris (P. pastoris) cell
containing in its genome at least one copy of a DNA
sequence operably encoding in P. pastoris at least a
portion of human CD4 glycoprotein, containing the site of
interaction between CD4 and the human immunodeficiency
virus (HIV), in operational association with a DNA
sequence encoding a signal sequence which functions to
direct secretion of said human CD4 glycoprotein or a
portion thereof in P. pastoris, both under the regulation
of a promoter region of a P. pastoris gene.

2. A P. pastoris cell according to Claim 1,
wherein said signal sequence-encoding DNA comprises a DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence,
and a DNA sequence encoding AMF processing-site lys-arg.

3. A P. pastoris cell according to Claim 2,
wherein said P. pastoris gene is the P. pastoris AOX1
gene.

4. A P. pastoris cell according to Claim 3
wherein said DNA sequence operably encodes in P. pastoris
the V1 region of human CD4 glycoprotein.

5. A P. pastoris cell according to Claim 4
containing at least two copies of said DNA sequences.

6. A P. pastoris cell containing in its
genome at least one copy of an expression cassette
comprising in the direction of transcription, a promoter
region of a first P. pastoris gene, a DNA sequence
operably encoding in P. pastoris at least a portion of
human CD4 glycoprotein, containing the site of
interaction between CD4 and the HIV virus, preceded by a
DNA sequence encoding a signal sequence directing the
secretion of said glycoprotein or a portion thereof in P.
pastoris, and a transcription terminator of a second P.
pastoris gene, said first and second P. pastoris genes
being identical or different, the segments of said
expression cassette being in operational association.

7. A P. pastoris cell according to Claim 6,
wherein said signal sequence-encoding DNA comprises a DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence
and a DNA sequence encoding AMF processing-site lys-arg.

8. A P. pastoris cell according to Claim 7
wherein said first and second P. pastoris genes are
identical and are the P. pastoris AOX1 gene.

9. A P. pastoris cell according to Claim 8
wherein said DNA sequence operably encodes in P. pastoris
the V1 region of human CD4 glycoprotein.

10. A P. pastoris cell according to Claim 9
containing at least two copies of said expression
cassette.

11. A P. pastoris cell according to Claim 10,
containing two copies of said expression cassette
integrated by addition at the AOX1 locus of said P.
pastoris genome.

12. A P. pastoris cell according to Claim 9,
containing a single copy of said expression cassette
integrated by addition at the AOX1 locus of said P.
pastoris genome.

13. A P. pastoris cell according to Claim 9,
containing a single copy of said expression cassette
integrated by gene replacement at the AOX1 locus of said
P. pastoris genome.

14. A DNA fragment optionally contained
within, or which is, a circular plasmid comprising at
least one copy of an expression cassette comprising in
the direction of transcription, a promoter region of a
first P. pastoris gene, a DNA sequence operably encoding
in P. pastoris at least a portion of human CD4
glycoprotein, containing the site of interaction between
CD4 and the HIV virus, preceded by a DNA sequence
encoding a signal sequence directing the secretion of
said glycoprotein or a portion thereof in P. pastoris,
and a transcription terminator of a second P. pastoris
gene, said first and second P. pastoris genes being
identical or different, the segments of said expression
cassette being in operational association.

15. A DNA fragment according to Claim 14,
wherein said signal sequence-encoding DNA comprises a DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence,
and a DNA sequence encoding AMF processing-site lys-arg.

16. A DNA fragment according to Claim 15,
wherein said first and second P. pastoris genes are
identical and are the P. pastoris AOX1 gene.

17. A DNA fragment according to Claim 16
wherein said DNA sequence operably encodes in P. pastoris
the V1 region of human CD4 glycoprotein.

18. A DNA fragment according to Claim 16,
further comprising a selectable marker gene and ends
having sufficient homology with a target gene to effect
integration of said DNA fragment therein.

19. A DNA fragment according to Claim 18,
wherein said target gene is the P. pastoris AOX1 gene.

20. A DNA fragment according to Claim 18 which
is a BglII digest of the expression vector pSCD103.

21. A DNA fragment according to Claim 18,
which is a SacI digest of the expression vector pSCD103.

22. An expression vector containing at least
one copy of an expression cassette comprising in the
direction of transcription, a promoter region of a first
P. pastoris gene, a DNA sequence operably encoding in P.
pastoris at least a portion of human CD4 glycoprotein,
containing the site of interaction between CD4 and the
HIV virus, preceded by a DNA sequence encoding a signal
sequence directing the secretion of said glycoprotein or
a portion thereof in P. pastoris, and a transcription
terminator of a second P. pastoris gene, said first and
second P. pastoris genes being identical or different,
the segments of said expression cassette being in
operational association.

23. An expression vector according to Claim
22, wherein said signal sequence-encoding DNA comprises a
DNA sequence encoding the S. cerevisiae AMF pre-pro
sequence, and a DNA sequence encoding AMF processing-site
lys-arg.

24. An expression vector according to Claim
23, further comprising sequences allowing for its
replication and selection in bacteria.

25. An expression vector according to Claim
24, which is a pBR322 derivative.

26. An expression vector according to Claim
25, which is the Pichia expression vector pSCD103.

27. A culture of viable P. pastoris cells
according to any one of Claims 1 to 13.

28. A process for producing and secreting at
least a portion of human CD4 glycoprotein, containing the
site of interaction between CD4 and the HIV virus, into
the culture medium comprising growing P. pastoris
transformants containing in their genome at least one
copy of a DNA sequence operably encoding in P. pastoris
at least a portion of human CD4 glycoprotein, containing
the site of interaction between CD4 and the HIV virus, in
operational association with a DNA sequence encoding a
signal sequence directing the secretion of said
glycoprotein or a portion thereof in P. pastoris, both
under the regulation of a promoter region of a P.
pastoris gene, under conditions allowing the expression
of said DNA sequences in said P. pastoris and secretion
of said glycoprotein or a portion thereof into the
culture medium in a substantially pure form,
substantially devoid of degradation products.

29. A process according to Claim 28, wherein
said signal sequence is the S. cerevisiae AMF pre-pro
sequence.

30. A process according to Claim 29, wherein
said transformants are developed from the P. pastoris
his4- strain GS115.

31. A process according to Claim 30, wherein
said transformants have the Mut+ phenotype.

32. A process according to Claim 28, which
comprises:

a. growing said P. pastoris transformants on a
medium containing repressing carbon source to generate
cell mass in absence of heterologous gene expression,

b. continuing growth under glycerol limitation
conditions, and

c. initiating heterologous gene expression by
adding methanol to the medium, and keeping the pH at or
below about 3.5 during said heterologous gene expression.

33. A process according to Claim 32, wherein
the pH is kept between about 2.5 and about 3.5 during
heterologous gene expression.

34. A process according to Claim 33, wherein
the pH is kept between about 2.5 and about 3.0 during
heterologous gene expression.

35. A process according to any one of Claims
28 to 34, further comprising the step of harvesting said
human CD4 glycoprotein or a portion thereof from the
culture medium.

36. A process for producing a heterologous
protein in P. pastoris, wherein the pH of the culture
medium is maintained at or below about 3.5 during
heterologous gene expression.

37. A process according to Claim 36, wherein
said heterologous protein is secreted into the
fermentation medium.

38. A process according to Claim 36, wherein
the pH is maintained between about 2.5 and about 3.5.

39. Substantially pure human CD4 glycoprotein
or a portion thereof containing the site of interaction
between CD4 and the human immunodeficiency virus (HIV)
produced in yeast.

40. Substantially pure human CD4 glycoprotein
or a portion thereof according to Claim 39 produced in
P. pastoris.

CAAGCCCAGAGCCCTGCGATTTCTGTGGGCTCAGGTCCGTACTGCTCAGCCCCTTCCTCC

                                    -20
                met asn arg gly val pro phe arg his leu leu
CTCGGCAAGGCCACA ATG AAC CGG GGA GTC CCT TTT AGG CAC TTG CTT  108

                -10                                 -1  +1
leu val leu gln leu ala leu leu pro ala ala thr gln gly asn
CTG GTG CTG CAA CTG GCG CTC CTC CCA GCA GCC ACT CAG GGA AAC

                                +10  *
lys val val leu gly lys lys gly asp thr val glu leu thr cys
AAA GTG GTG CTG GGC AAA AAA GGG GAT ACA GTG GAA CTG ACC TGT  198

            +20                                     +30
thr ala ser gln lys lys ser ile gln phe his trp lys asn ser
ACA GCT TCC CAG AAG AAG AGC ATA CAA TTC CAC TGG AAA AAC TCC

                                +40
asn gln ile lys ile leu gly asn gln gly ser phe leu thr lys
AAC CAG ATA AAG ATT CTG GGA AAT CAG GGC TCC TTC TTA ACT AAA  288

            +50                                     +60
gly pro ser lys leu asn asp arg ala asp ser arg arg ser leu
GGT CCA TCC AAG CTG AAT GAT CGC GCT GAC TCA AGA AGA AGC CTT

                                +70
trp asp gln gly asn phe pro leu ile ile lys asn leu lys ile
TGG GAC CAA GGA AAC TTC CCC CTG ATC ATC AAG AAT CTT AAG ATA  378

            +80  *                                  +90
glu asp ser asp thr tyr ile cys glu val glu asp gln lys glu
GAA GAC TCA GAT ACT TAC ATC TGT GAA GTG GAG GAC CAG AAG GAG

                                +100
glu val gln leu leu val phe gly leu thr ala asn ser asp thr
GAG GTG CAA TTG CTA GTG TTC GGA TTG ACT GCC AAC TCT GAC ACC  468

            +110                                    +120
his leu leu gln gly gln ser leu thr leu thr leu glu ser pro
CAC CTG CTT CAG GGG CAG AGC CTG ACC CTG ACC TTG GAG AGC CCC

                                +130
pro gly ser ser pro ser val gln cys arg ser pro arg gly lys
CCT GGT AGT AGC CCC TCA GTG CAA TGT AGG AGT CCA AGG GGT AAA  558

FIG. 11-1

            +140                                    +150
asn ile gln gly gly lys thr leu ser val ser gln leu glu leu
AAC ATA CAG GGG GGG AAG ACC CTC TCC GTG TCT CAG CTG GAG CTC

                                +160  *
gln asp ser gly thr trp thr cys thr val leu gln asn gln lys
CAG GAT AGT GGC ACC TGG ACA TGC ACT GTC TTG CAG AAC CAG AAG  648

            +170                                    +180
lys val glu phe lys ile asp ile val val leu ala phe gln lys
AAG GTG GAG TTC AAA ATA GAC ATC GTG GTG CTA GCT TTC CAG AAG

                                +190
ala ser ser ile val tyr lys lys glu gly glu gln val glu phe
GCC TCC AGC ATA GTC TAT AAG AAA GAG GGG GAA CAG GTG GAG TTC  738

            +200                                    +210
ser phe pro leu ala phe thr val glu lys leu thr gly ser gly
TCC TTC CCA CTC GCC TTT ACA GTT GAA AAG CTG ACG GGC AGT GGC

                                +220
glu leu trp trp gln ala glu arg ala ser ser ser lys ser trp
GAG CTG TGG TGG CAG GCG GAG AGG GCT TCC TCC TCC AAG TCT TGG  828

            +230                                    +240
ile thr phe asp leu lys asn lys glu val ser val lys arg val
ATC ACC TTT GAC CTG AAG AAC AAG GAA GTG TCT GTA AAA CGG GTT

                                +250
thr gln asp pro lys leu gln met gly lys lys leu pro leu his
ACC CAG GAC CCT AAG CTC CAG ATG GGC AAG AAG CTC CCG CTC CAC  918

            +260                                    +270
leu thr leu pro gln ala leu pro gln tyr ala gly ser gly asn
CTC ACC CTG CCC CAG GCC TTG CCT CAG TAT GCT GGC TCT GGA AAC

                                +280
leu thr leu ala leu glu ala lys thr gly lys leu his gln glu
CTC ACC CTG GCC CTT GAA GCG AAA ACA GGA AAG TTG CAT CAG GAA  1008

            +290                                    +300
val asn leu val val met arg ala thr gln leu gln lys asn leu
GTG AAC CTG GTG GTG ATG AGA GCC ACT CAG CTC CAG AAA AAT TTG

                                +310
thr cys glu val trp gly pro thr ser pro lys leu met leu ser
ACC TGT GAG GTG TGG GGA CCC ACC TCC CCT AAG CTG ATG CTG AGC  1098

FIG. 11-2

            +320                                    +330
leu lys leu glu asn lys glu ala lys val ser lys arg glu lys
TTG AAA CTG GAG AAC AAG GAG GCA AAG GTC TCG AAG CGG GAG AAG

                                +340  *
ala val trp val leu asn pro glu ala gly met trp gln cys leu
GCG GTG TGG GTG CTG AAC CCT GAG GCG GGG ATG TGG CAG TGT CTG  1188

            +350                                    +360
leu ser asp ser gly gln val leu leu glu ser asn ile lys val
CTG AGT GAC TCG GGA CAG GTC CTG CTG GAA TCC AAC ATC AAG GTT

                                +370
leu pro thr trp ser thr pro val gln pro met ala leu ile val
CTG CCC ACA TGG TCC ACC CCG GTG CAG CCA ATG GCC CTG ATT GTG  1278

            +380                                    +390
leu gly gly val ala gly leu leu leu phe ile gly leu gly ile
CTG GGG GGC GTC GCC GGC CTC CTG CTT TTC ATT GGG CTA GGC ATC

                                +400
phe phe cys val arg cys arg his arg arg arg gln ala glu arg
TTC TTC TGT GTC AGG TGC CGG CAC CGA AGG CGC CAA GCA GAG CGG  1368

            +410                                    +420
met ser gln ile lys arg leu leu ser glu lys lys thr cys gln
ATG TCT CAG ATC AAG AGA CTC CTC AGT GAG AAG AAG ACC TGC CAG

                                +430
cys pro his arg phe gln lys thr cys ser pro ile
TGC CCT CAC CGG TTT CAG AAG ACA TGT AGC CCC ATT TGA GGCACGA  1459

GGCCAGGCAGATCCCACTTGCAGCCTCCCCAGGTGTCTGCCCCGCGTTTCCTGCCTGCGG

ACCAGATGAATGTAGCAGATCCCACGCTCTGGCCTCCTGTTCGTCCTCCCTACAATTTG  1578

CCATTGTTTCTCCTGGGTTAGGCCCCGGCTTCACTGGTTGAGTGTTGCTCTCTAGTTTCC

AGAGGCTTAATCACACCCTCCTCCACGCCATTTCCTTTTCCTTCAAGCCTAGCCCTTCT  1697

CTCATTATTTCTCTCTGACCCTCTCCCCACTGCTCATTTGGATCC  1742

FIG. 11-3